Upload
kmeena73
View
218
Download
0
Embed Size (px)
Citation preview
8/10/2019 Wayne Daniel Ppt
1/186
/04/34
Lectures of Stat -145
(Biostatistics)
Text bookBiostatisticsBasic Concepts and Methodology for the Health Sciences
By
Wayne W. Daniel
Prepared By:
Sana A. [email protected]
Chapter 1
Introduction To
Biostatistics
Text Book : Basic Concepts andMethodology for the Health Sciences 2
8/10/2019 Wayne Daniel Ppt
2/186
/04/34
Key words :
Statistics , data , Biostatistics,
Variable ,Population ,Sample
Text Book : Basic Concepts andMethodology for the Health Sciences 3
IntroductionSome Basic concepts
Statisticsis a field of studyconcerned
with
1-collection, organization, summarizationand analysisof data.
2-drawing of inferencesabout a body of
data when only a part of the data is
observed.
Statisticianstry to interpretand
communicatethe results to others.
Text Book : Basic Concepts andMethodology for the Health Sciences 4
8/10/2019 Wayne Daniel Ppt
3/186
/04/34
Data:
The raw material of Statisticsis data.We may define data as figures. Figures
result from the process of countingorfrom taking a measurement.
For example
- When a hospital administrator counts thenumber of patients (counting).
- When a nurse weighs a patient
(measurement)
Text Book : Basic Concepts andMethodology for the Health Sciences 5
* A variable:
It is a characteristic that takes on differentvalues in different persons, places, or
things.
For example:
- heart rate,
- the heights of adult males,- the weights of preschool children,
- the ages of patients seen in a dental clinic.
Text Book : Basic Concepts andMethodology for the Health Sciences 6
8/10/2019 Wayne Daniel Ppt
4/186
/04/34
* Biostatistics:
The tools of statisticsare employed in manyfields:
business, education, psychology, agriculture,
economics, etc.
When the data analyzed are derived from
the biological scienceand medicine,
we use the term biostatisticsto distinguish
this particular application of statistical
tools and concepts.
Text Book : Basic Concepts andMethodology for the Health Sciences 7
Quantitative VariablesIt can be measured in
the usual sense.
For example:
- the heights of adultmales,
- the weights ofpreschool children,
- the ages of patientsseen in a
- dental clinic.
Qualitative Variables
Many characteristics are not
capable of being measured.
Some of them can be
ordered (called ordinal)
and Some of them cant be
ordered (called nominal).
For example:
- classification of people into
socio-economic groups
-.hair color
Text Book : Basic Concepts andMethodology for the Health Sciences 8
Types of variablesQuantitative
Qualitative
8/10/2019 Wayne Daniel Ppt
5/186
/04/34
A discrete variableis characterized by gaps
or interruptions in thevalues that it canassume.
For example:
- The number of dailyadmissions to ageneral hospital,
- The number of
decayed, missing orfilled teeth per child
in an elementary
- school.
A continuous variablecan assume any value within a
specified relevant interval ofvalues assumed by the variable.
For example:
- Height,
- weight,
- skull circumference.
No matter how close together the
observed heights of two people,we can find another personwhose height falls somewhere inbetween.
Text Book : Basic Concepts andMethodology for the Health Sciences 9
Types of quantitative variables
Discrete
Continuous
As the name implies itconsist of namingor classifies intovarious mutuallyexclusive categories
For example:
- Male - female- Sick - well
- Marriedsingle -divorced
.Whenever qualitativeobservation
Can be ranked or orderedaccording to somecriterion.
For example:- Blood pressure level
(high-good-low)
- Grades
(ExcellentV.goodgoodfail)
Text Book : Basic Concepts andMethodology for the Health Sciences 10
Types of qualitative variables
Nominal
Ordinal
8/10/2019 Wayne Daniel Ppt
6/186
/04/34
*A population:
It is the largest collection of valuesof arandom variablefor which we have aninterest at a particular time.
For example:
The weights of all the children enrolled in acertain elementary school.
Populations may be finiteor infinite.
Text Book : Basic Concepts andMethodology for the Health Sciences 11
*A sample:It is a part of a population.
For example:
The weights of only a fraction of
these children.
Text Book : Basic Concepts andMethodology for the Health Sciences 12
8/10/2019 Wayne Daniel Ppt
7/186
/04/34
Exercises
Question (6)Page 17 Question (7)Page 17
Situation A , Situation B
Text Book : Basic Concepts andMethodology for the Health Sciences 13
Exercises:Q6: For each of the following variables
indicate whether it is quantitative orqualitative variable:
(a)The blood type of some patient in the
hospital. Qualitative Nominal(b) Blood pressure level of a patient.
(Qualitative ordinal)
Text Book : Basic Concepts andMethodology for the Health Sciences 14
8/10/2019 Wayne Daniel Ppt
8/186
/04/34
(c)Weights of babies born in a hospital
during a year. Quantitative continues(d) Gender of babies born in a hospital during a
year. Qualitative nominal
(e)The distance between the hospital to thehouseQuantitative continues
(f) Under-arm temperature of day-old infants
born in a hospital. Quantitativecontinues
Text Book : Basic Concepts andMethodology for the Health Sciences 15
Q7: For each of the following situations,answer questions a through d:
(a)What is the population?
(b)What is the sample in the study?
(c)What is the variable of interest?
(d)What is the type of the variable?
Situation A:A study of 300 households in asmall southern town revealed that if she hasschool-age child present.
Text Book : Basic Concepts andMethodology for the Health Sciences 16
8/10/2019 Wayne Daniel Ppt
9/186
/04/34
All households in a small(a) Population:
southern town.
300 households in a small(b) Sample:
southern town.
(c) Variable: Does households had school agechild present.
(d)Variable is qualitative nominal.
Text Book : Basic Concepts andMethodology for the Health Sciences 17
Situation B:A study of 250 patients admitted
to a hospital during the past year revealedthat, the patients lived from the hospital.
(a) Population:All patients admitted to ahospital during the past year.
(b) Sample: 250 patients admitted to a hospitalduring the past year.
Text Book : Basic Concepts andMethodology for the Health Sciences 18
8/10/2019 Wayne Daniel Ppt
10/186
/04/34
(c) Variable: Distancethe patient lived away
from the hospital
Variable is Quantitative continuous.(d)
Text Book : Basic Concepts andMethodology for the Health Sciences 19
Choose the right answer:1-The variable is a
a. subset of the population.
b. parameter of the population.
c. relative frequency.
d. characteristic of the population to be measured.
e. class interval.
2-Which of the following is an example of discrete variablea. the number of students taking statistics in this term at ksu.
b. the time to exercise daily.
c. whether or not someone has a disease
d. height of certain buildings
e. Level of education
8/10/2019 Wayne Daniel Ppt
11/186
/04/34
3.Which of the following is not an example of discrete
variable
a. the number of students at the class of statistics.b. the number of times a child cry in a certain street.
c. the time to run a certain distance.
d. the number of buildings in a certain street.
e. number of educated persons in a family.
4.Which of the following is an example of nominal
qualitative variable
a. blood pressure level.
b. the number of times a child brush his/her teeth.
c. whether or not someone fail in an exam.
d. Weight of babies at birth.
e. the time to run a certain distance.
5.The continuous variable is a
a. variable with a specific number of values.
b. variable which cant be measured.
c. variable takes on values within intervals.
d. variable with no mode.
e. qualitative variable.
6. which of the following is an example of continuous variable
a. The number of visitors of the clinic yesterday.
b. The time to finish the exam.
c. The number of patients suffering from certain disease.
d. Whether or not the answer is true.
8/10/2019 Wayne Daniel Ppt
12/186
/04/34
7. The discrete variable is
a-qualitative variable.
b-variable takes on values within interval.C-variable with a specific number of values.d-variable with no mode.
8-Which of the following is an example of nominalvariable :
a-age of visitors of a clinic.b-The time to finish the exam.
c-Whether or not a person is infected by influenza.
d-Weight for a sample of girls .
9-The nominal variable is a
a-A variable with a specific number of values
b-Qualitative variable that cant be ordered.
c-variable takes on values within interval.
d-Quantitative variable .
10-Which of the following is an example of
nominal variable :a-The number of persons who are injured in accident.
b-The time to finish the exam.
c-Whether or not the medicine is effective.
d-Socio-economic level.
8/10/2019 Wayne Daniel Ppt
13/186
/04/34
11-The ordinal variable is :
a-variable with a specific number of values.
b-variable takes on values within interval.
c-Qualitative variable that can be ordered.
d-Variable that has more than mode.
Chapter ( 2 )Strategies for understanding the
meanings of Data
Pages 19 27)
8/10/2019 Wayne Daniel Ppt
14/186
/04/34
Key words
frequency table, bar chart ,range
width of interval ,mid-intervalHistogram , Polygon
Text Book : Basic Concepts andMethodology for the Health Sciences 27
Descriptive StatisticsFrequency Distribution
for Discrete Random VariablesExample:
Suppose that we take sampleof size 16 from children in aprimary school and get thefollowing data about thenumber of their decayed teeth,
3,5,2,4,0,1,3,5,2,3,2,3,3,2,4,1
To construct a frequency table
We need three columns:
1.Variable name
2.Frequency (f):how manmber3.Relative frequency(R.f)=
Frequency / n
Relative
Frequenc
y
Frequenc
y
No. of
decayed
teeth
0.0625
0.1250.25
0.3125
0.125
0.125
1
24
5
2
2
0
12
3
4
5
116=nTotal
8/10/2019 Wayne Daniel Ppt
15/186
/04/34
Representing the simple frequency table
using the bar chart
Number of decayed teeth
5.004.003.002.001.00.00
6
5
4
3
2
1
0
22
5
4
2
1
Text Book : Basic Concepts andMethodology for the Health Sciences 29
We can represent theabove simplefrequency table usingthe bar chart.
We can get :
1. The sample size?
2. Number of childrenwith decayedteeth=2?
3. Relative frequencyof children withdecayed teeth =4?
2.3 Frequency Distributionfor Continuous Random Variables
For large samples, we cant use the simple frequency tableto represent the data.
We need to dividethe data into groupsor intervals orclasses.
So, we need to determine:
1- The number of intervals (k).
Too fewintervals are not good because information will belost.
Too manyintervals are not helpful to summarize the data.
Text Book : Basic Concepts andMethodology for the Health Sciences 30
8/10/2019 Wayne Daniel Ppt
16/186
/04/34
2- The range (R).
It is the difference between the largest and
the smallest observation in the dataset.[R=MaxMin]
3- The Width of the interval (w).
Classintervals generally should be of the same
width.
Text Book : Basic Concepts andMethodology for the Health Sciences 31
Frequency
(f)
Class interval
113039
464049
705059
456069
167079
18089
189Total
Text Book : Basic Concepts andMethodology for the Health Sciences 32
Sum of frequency=sample size=n
8/10/2019 Wayne Daniel Ppt
17/186
/04/34
Text Book : Basic Concepts andMethodology for the Health Sciences 33
Cumulative Frequency:The
It can be computed by adding successivefrequencies.
The Cumulative Relative Frequency:
It can be computed by adding successive relativefrequencies.
interval:-MidTheIt can be computed by adding the lower bound of
the interval plus the upper bound of it and thendivide over 2.[(Lower bound + Upper bound)/2]
For the above example, the
following table represents the
cumulative frequency, the
relative frequency, the
cumulative relative frequencyand the mid-interval.
Text Book : Basic Concepts andMethodology for the Health Sciences 34
8/10/2019 Wayne Daniel Ppt
18/186
/04/34
Cumulative
RelativeFrequency
Relative
FrequencyR.f
Cumulative
Frequency
Frequency
Freq (f)
Mid
interval
Class
interval
0.05820.0582111134.53039
-0.2434574644.54049
0.6720-127-54.55059
0.91010.2381-45-6069
0.99480.08471881674.57079
10.0053189184.58089
1189Total
Text Book : Basic Concepts andMethodology for the Health Sciences 35
R.f= freq/n
Example :
From the above frequency table, complete the tablethen answer the following questions:
1-The number of objects with age less than 50 years ?
2-The number of objects with age between 40-69 years?
3-Relative frequency of objects with age between 70-79
years ? 4-Relative frequency of objects with age more than 69
years ?
5-The percentage of objects with age between 40-49years ?
Text Book : Basic Concepts andMethodology for the Health Sciences 36
8/10/2019 Wayne Daniel Ppt
19/186
/04/34
6-The percentage of objects with age less than 60 years ?
7-The Range (R) ?
8- Number of intervals (K)?
9- The width of the interval ( W) ?
Text Book : Basic Concepts andMethodology for the Health Sciences 37
Representing the grouped frequency table using the histogram
To draw the histogram, the true classes limitsshould be used. They can becomputed by subtracting 0.5 from thelower limit and adding 0.5 to theupper limit for each interval.
FrequencyTrue class
limits
1129.5
8/10/2019 Wayne Daniel Ppt
20/186
/04/34
Representing the grouped frequency table
using the Polygon
0
10
20
30
40
50
60
7080
345 445 545 645 745 845
Text Book : Basic Concepts andMethodology for the Health Sciences 39
Q1] For a sample of patients, we obtain the following graph forapproximated hours spent without pain after a certain surgery
8/10/2019 Wayne Daniel Ppt
21/186
/04/34
1) The type of the graph is:
(e)Bar chart (f) polygon(d) histogram(c) line(a) curve (b) not known
2) The number of patients stayed the longest time without pain is:(f) 10(e) 15(d) 6(c) 5(b) 80(a) 1
3)The percent of patients spent 3.5 hours or more without pain is:
(f) 37.5%(e) 68.75%(d) 18.75%(c) 50%(a) 25% (b) 30%
4)The lowest number of hours spent without pain is:
(f))10(e) 1(d) 0.5(c) 5(b) 25 (a) 6.5
6)The mode equals
(f) 80(d) 3(d) 15(c) 2,4(b) 6(a) we can't find it
the following histogram show the frequency distribution of pathologic][H.Wtumor size ( in cm) for a sample of 110 cancer patients:
8/10/2019 Wayne Daniel Ppt
22/186
/04/34
1. The percent of cancer patients with approximate level of pathologic tumor size =2
cm is:
(a)
18% (b) 50% (c) 16.36% (d) 32.72% (e) 36% (f) 0%2. The number of cancer patients with lowest pathologic tumor size is:
(a) 0.5 (b) 3 (c) 11 (d) 13 (e) 15 (f) 24
3. The approximate size of pathologic tumorwith highest percentage of patients is:
(a) 3 (b) 1 (c) 36 (d) 110 (e) 32.72% (f) 16.36%
4. What the approximate value of the sample mean
(a) 18.33 (b) 36 (c) 1 (d) 1.75 (e) 1.586 (f) we can't find it
5. The mode equals
(a)
1 (b) 3 (c) 36 (d) 55 (e) 110 (f) we can't find it
6. The approximate value of the sample variance
(a) 3 (b) 0.6 (c) 0.774 (d) 1.586 (e) 110 (f) we can't find it
Exercises
Pages : 3134
Questions: 2.3.2(a) , 2.3.5 (a)
H.W. : 2.3.6 , 2.3.7(a)
Text Book : Basic Concepts andMethodology for the Health Sciences 44
8/10/2019 Wayne Daniel Ppt
23/186
/04/34
Exercises:
Q2.3.2:Janardhan et al. (A-2) conducted a study
in which they measured incidental intracranialaneurysms (IIAs) in 125 patients. Theresearchers examined post proceduralcomplications and concluded that IIAs can besafely treated without causing mortality andwith a lower complications rate than previouslyreported.
Text Book : Basic Concepts andMethodology for the Health Sciences 45
The following are the sizes (in millimeters) of the159 IIAs in the sample.
Text Book : Basic Concepts andMethodology for the Health Sciences 46
frequencyClass Interval
290-4
875-9
2610-14
1015-19
420-24
125-29
230-34
159Total
8/10/2019 Wayne Daniel Ppt
24/186
/04/34
(a)Use the frequency table to prepare:
* A relative frequency distribution* A cumulative frequency distribution
* A cumulative relative frequency distortion
Text Book : Basic Concepts andMethodology for the Health Sciences 47
(b) What percentage of the measurements arebetween 10 and 14 inclusive?
(c) How many observations are less than 20?
(d) What proportion of the measurements are
greater than orequal to 25?(e) What percentage of the measurements areeither less than 10 or greater than 19?
Text Book : Basic Concepts and
Methodology for the Health Sciences 48
8/10/2019 Wayne Daniel Ppt
25/186
/04/34
Q2.3.5:The following table shows the number of
hours 45 hospital patients slept following the
administration of a certainanesthetic.(a) From thesedata construct:* A relativefrequencydistribution
FrequencyClass Interval
211-5
166-10
611-15
216-20
45Total
Text Book : Basic Concepts andMethodology for the Health Sciences 49
(b) How many of the measurements are greaterthan 10? Ans: 8
(c) What percentage of the measurements arebetween 6-15 ?
Ans: 49%
(d) What proportion of the measurement is lessthan or equal 15? Ans: 0.96
Text Book : Basic Concepts andMethodology for the Health Sciences 50
8/10/2019 Wayne Daniel Ppt
26/186
/04/34
Q2.3.6:The following are the number of babies
born during a year in 60 community hospitals.
(a) From thesedata construct:
*A relative
frequency
distribution
*
Text Book : Basic Concepts andMethodology for the Health Sciences 51
FrequencyClass Interval
520-24
625-29
930-34
335-39
540-44
845-49
1150-54
1355-5960Total
Q2.3.7: In a study ofphysical endurance
levels of male college
freshman, the
following composite
endurance scores
based on severalexercise routines
were collected.
Text Book : Basic Concepts andMethodology for the Health Sciences 52
FrequencyClass interval
6115-134
7135-154
16155-174
31175-194
37195-214
28215-234
18235-254
8255-274
3275-2941295-314
155Total
8/10/2019 Wayne Daniel Ppt
27/186
/04/34
(a) From these data construct:
* A relative frequency distribution
Text Book : Basic Concepts andMethodology for the Health Sciences 53
) :4.2Section (
Descriptive Statistics
Measures of Central
Tendency41-38Page
8/10/2019 Wayne Daniel Ppt
28/186
/04/34
key words:
Descriptive Statistic, measure ofcentral tendency ,statistic, parameter,mean () ,median, mode.
Text Book : Basic Concepts andMethodology for the Health Sciences 55
The Statistic and The Parameter
A Statistic:It is a descriptive measure computed from the
data of a sample.
A Parameter:It is aa descriptive measure computed from
the data of a population.Since it is difficult to measure a parameter from the
population, a sampleis drawn of size n, whosevalues are
1 , 2 , ,n. From this data, wemeasure the statistic.
Text Book : Basic Concepts andMethodology for the Health Sciences 56
8/10/2019 Wayne Daniel Ppt
29/186
/04/34
Measures of Central Tendency
A measure of central tendency is a measure whichindicates where the middleof the data is.
The three most commonly used measures of central
tendency are:
The Mean, the Median, and the Mode.
The Mean :
It is the average of the data.
Text Book : Basic Concepts andMethodology for the Health Sciences 57
The Population Mean:
= which is usually unknown, then we use the
sample mean to estimate or approximate it.
The Sample Mean:=
Example:
Here is a random sample of size 10 of ages, where
1 = 42, 2 = 28, 3= 28, 4= 61, 5= 31,
6= 23, 7= 50, 8= 34, 9 = 32, 10 = 37.
= (42 + 28 + + 37) / 10 = 36.6x
x
Text Book : Basic Concepts andMethodology for the Health Sciences 58
1
N
ii
N
X
1
n
ii
n
x
8/10/2019 Wayne Daniel Ppt
30/186
/04/34
Properties of the Mean:
Uniqueness.For a given set of data there is one
and only one mean. Simplicity. It is easy to understand and to
compute.
Affected by extreme values. Since all valuesenter into the computation.
Example:Assume the values are 115, 110, 119, 117, 121 and
126. The mean = 118.
But assume that the values are 75, 75, 80, 80 and 280. Themean = 118, a value that is not representative of the set of
data as a whole.
Text Book : Basic Concepts andMethodology for the Health Sciences 59
The Median:
When orderingthe data, it is the observation that divide theset of observations into two equal partssuch that half of thedata are before it and the other are after it.
* If n is odd, the median will be the middle of observations. Itwill be the (n+1)/2 thordered observation.
When n = 11, then the median is the 6thobservation.
* If n is even, there are two middle observations. The medianwill be the mean of these two middle observations. It will be
the mean of the [ (n/2)th, (n/2 +1)th]ordered observation.When n = 12, then the median is the 6.5thobservation, which
is an observation halfway between the 6thand 7thorderedobservation.
Text Book : Basic Concepts andMethodology for the Health Sciences 60
8/10/2019 Wayne Daniel Ppt
31/186
/04/34
Example:
For the same random sample, the ordered
observations will be as:23, 28, 28, 31, 32, 34, 37, 42, 50, 61.
Since n = 10, then the median is the 5.5thobservation,i.e. = (32+34)/2 = 33.
Properties of the Median:
Uniqueness.For a given set of data there isone and only one median.
Simplicity. It is easy to calculate. It is not affected by extreme values as is
the mean.
Text Book : Basic Concepts andMethodology for the Health Sciences 61
The Mode:
It is the value which occurs most frequently.
If all values are different there is no mode.
Sometimes, there are more than one mode.
Example:
For the same random sample, the value 28 isrepeated two times, so it is the mode.
Properties of the Mode:
Sometimes, it is notunique.
It may be used fordescribing qualitative data.
It is not affected by extreme values
Text Book : Basic Concepts andMethodology for the Health Sciences 62
8/10/2019 Wayne Daniel Ppt
32/186
/04/34
Examples
Find the mean and the mode for thefollowing Relative Frequency?
Mode = 7(has the higher frequency)
Text Book : Basic Concepts andMethodology for the Health Sciences 63
n
xfx
Age(x)
frequenc
y(f)
X f
56710
2341
10182810
Total 10 66
6.610
66x
ExamplesFind the mean and the mode for the
following grouped
Frequency table?
Mode :interval( 79 )(can't give exact number only the interval
with higher Frequency)Text Book : Basic Concepts and
Methodology for the Health Sciences 64
n
xfx
4.71074 x
Age
Frequency
(f)
Midpoint
(X)
X f
1 - 34 - 67 - 9
10 - 12
2 1
4 3
2 5 8 11
4 5 32 33
Total
10
_
74
8/10/2019 Wayne Daniel Ppt
33/186
/04/34
Examples
Number of decayed teeth
5.004.003.002.001.00.00
6
5
4
3
2
1
0
22
5
4
2
1
Text Book : Basic Concepts andMethodology for the Health Sciences 65
Find the mean andthe mode for the
following bar
chart?
Solution :
Mode = 3
(has the higherfrequency)
16
)25()24()53()42()21()10(
225421
xxxxxxx
n
Text Book : Basic Concepts andMethodology for the Health Sciences 66
687.216
43x
n
xfx
8/10/2019 Wayne Daniel Ppt
34/186
/04/34
key words:
Descriptive Statistic, measure ofdispersion , range ,variance, coefficient ofvariation.
Text Book : Basic Concepts andMethodology for the Health Sciences 67
2.5. Descriptive StatisticsMeasures of Dispersion:
A measure of dispersion conveys information regarding theamount of variability present in a set of data.
Note:
1. If all the values are the same
There is no dispersion .2.If all the values are different
There is a dispersion:a).If the valuescloseto each other
The amount of Dispersion small.b)If the values are widely scattered
The Dispersion is greater.
Text Book : Basic Concepts andMethodology for the Health Sciences 68
8/10/2019 Wayne Daniel Ppt
35/186
/04/34
43Page1.5.2Ex. Figure
**Measures of Dispersion are :1.Range (R).
2.Variance.
3.Standard deviation.
4.Coefficient of variation (C.V).
Text Book : Basic Concepts andMethodology for the Health Sciences 69
1.The Range (R):
Range =Largest value- Smallest value =
Note: Range concern only onto two values Example 2.5.1 Page 40: Refer to Ex 2.4.2.Page 37 Data: 43,66,61,64,65,38,59,57,57,50. Find Range? Range=66-38=28
Text Book : Basic Concepts andMethodology for the Health Sciences 70
SL xx
8/10/2019 Wayne Daniel Ppt
36/186
/04/34
2.The Variance: It measure dispersion relative to the scatter of the values a
bout there mean.
a) Sample Variance ( ) :
,where is sample mean
Example 2.5.2 Page 40:
Refer to Ex 2.4.2.Page 37
Find Sample Variance of ages , = 56
Solution: S2= [(43-56) 2+(66-56) 2+..+(50-56) 2]/ (10-1)
= 810/9 = 90
x
x
Text Book : Basic Concepts andMethodology for the Health Sciences 71
2S
1
)(1
2
2
n
xx
S
n
i
i
b)PopulationVariance( ) :
where , is Population mean
3.The Standard Deviation:
is the square root of variance=
a) Sample Standard Deviation = S =
b)Population Standard Deviation = =
Text Book : Basic Concepts andMethodology for the Health Sciences 72
2
N
xN
i
i
1
2
2
)(
Varince
2S
2
8/10/2019 Wayne Daniel Ppt
37/186
/04/34
Note:
You can use the calculator to find : 1.Mean
2.Variance
3. Standard deviation
How to use calculator is in the Appendix))
Text Book : Basic Concepts andMethodology for the Health Sciences 73
4.The Coefficient of Variation(C.V):
Is a measure use to compare thedispersion in two sets of data which isindependent of the unit of themeasurement .
where S: Sample standard deviation.
: Sample mean.
Text Book : Basic Concepts andMethodology for the Health Sciences 74
)100(.
X
SVC
X
8/10/2019 Wayne Daniel Ppt
38/186
/04/34
:46Page3.5.2Example
Suppose two samples of human males yield thefollowing data:
Sampe1 Sample2
Age 25-year-olds 11year-olds
Mean weight 145 pound 80 pound
Standard deviation 10 pound 10 pound
Text Book : Basic Concepts andMethodology for the Health Sciences 75
We wish to know which is more variable.
Solution:
c.v (Sample1)= (10/145)*100= 6.9 %
c.v (Sample2)= (10/80)*100= 12.5 %
Then age of 11-years old(sample2) is more variation
Text Book : Basic Concepts andMethodology for the Health Sciences 76
8/10/2019 Wayne Daniel Ppt
39/186
/04/34
Exercises
Pages : 5253 Questions: 2.5.1 , 2.5.2 ,2.5.3
H.W. : 2.5.4 , 2.5.5, 2.5.6, 2.5.14 * Also you can solve in the review questions
page 57:
Q: 12,13,14,15,16, 19
Text Book : Basic Concepts andMethodology for the Health Sciences 77
Exercises:For each of the data sets in the following
exercises Use calculator if possible))compute:(a) The mean
(b) The median
(c) The mode
(d) The range
(e) The variance(f) The standard deviation
(g) The coefficient of variation
Text Book : Basic Concepts andMethodology for the Health Sciences 78
8/10/2019 Wayne Daniel Ppt
40/186
/04/34
Q2.5.3:
Butz et al. (A-10) evaluated the duration of benefit
derived from the use of noninvasive positive-pressure ventilation by patients with amyotrophiclateral sclerosis on symptoms, quality of life, andsurvival. One of the variables of interest is partialpressure of arterial carbon dioxide (PaCO2). Thevalues below ( mm of Hg ) reflect the result of
baseline testing on 30 subjects as established by
arterial blood gas analyses.
Text Book : Basic Concepts andMethodology for the Health Sciences 79
seven rat pups from the experiment involving the carotidartery.
500 570 560 570 450 560 570
(a) The mean (b) The median
Ans: 540 Ans: 560
(c) The mode (d) The range
Ans: 570 Ans: 120(e) The variance (f) The standard deviation
Ans: 2200 Ans: 46.90
(g) The coefficient of variation Ans: 8.69%
Text Book : Basic Concepts andMethodology for the Health Sciences 80
8/10/2019 Wayne Daniel Ppt
41/186
/04/34
H.W :Q2.5.1:
Porcellini et al. (A-8) studied 13 HIV-positive
patients who were treated with highly activeantiretroviral therapy (HAART) for at least 6months. The CD4 T cell counts ( ) atbaseline for the 13 subjects are listed below.
230 205 313 207 227 245 173
58 103 181 105 301 169
Text Book : Basic Concepts andMethodology for the Health Sciences 81
l106
(a) The mean (b) The median
Ans: 193.6 Ans: 207
(c) The mode (d) The range
Ans: no mode Ans: 255
(e) The variance (f) The standard deviation
Ans: 5568.09 Ans: 74.62
(g) The coefficient of variation Ans: 38.543%
Text Book : Basic Concepts andMethodology for the Health Sciences 82
8/10/2019 Wayne Daniel Ppt
42/186
/04/34
Q2.5.2:H.W Shrair and Jasper (A-9) investigatedwhether decreasing the venous return in young
rats would affect ultrasonic vocalizations(USVs). Their research showed no significantchange in the number of ultrasonicvocalizations when blood was removed fromeither the superior vena cava or the carotidartery. Another important variable measuredwas the heart rate (bmp) during the withdrawal
of blood. The data below presents the heartrate of
Text Book : Basic Concepts andMethodology for the Health Sciences 83
40.0 47.0 34.0 42.0 54.0 48.0 53.6 56.958.0 45.0 54.5 54.0 43.0 44.3
53.9 41.8 33.0 43.1 52.4 37.9 34.5
40.1 33.0 59.9 62.6 54.1 45.7 40.6 56.659.0
(a) The mean (b) The median
Ans: 47.41 Ans: 46.35(c) The mode (d) The rangeAns: 33, 54 Ans: 29.6
Text Book : Basic Concepts andMethodology for the Health Sciences 84
8/10/2019 Wayne Daniel Ppt
43/186
/04/34
(e) The variance (f) The standard deviation
Ans: 46.53 Ans: 8.75
(g) The coefficient of variation
(H.W)Q2.5.4:
According to Starch et al. (A-11), hamstring tendongrafts have been the weak link in anteriorcruciate ligament reconstruction. In a controlledlaboratory study, they compared two techniquesfor reconstruction : either an interference screwor a central sleeve and
Text Book : Basic Concepts andMethodology for the Health Sciences 85
screw on the tibial side. For eight cadavericknees, the measurements below represent therequired force ( in Newtones) at which initialfailure of graft strands occurred for thecentral sleeve and screw technique.
172.5 216.63 212.62 98.97 66.95239.76 19.57 195.72
(a) The mean (b) The medianAns: 152.84 Ans: 184.11
(c) The mode (d) The range
Ans: no mode Ans: 220.19
Text Book : Basic Concepts and
Methodology for the Health Sciences 86
8/10/2019 Wayne Daniel Ppt
44/186
/04/34
(e) The variance (f) The standard deviation
Ans: 6494.732 Ans: 80.5899
(g) The coefficient of variation Ans: 52.73%Q2.5.5:
Cardosi et al. (A-12) performed a 4 yearsretrospective review of 102 women undergoingradical hysterectomy for cervical or endometrialcancer. Catheter-associated urinary tract infection
was observed in 12 of the subjects. Below are thenumbers of
Text Book : Basic Concepts andMethodology for the Health Sciences 87
postoperative days until diagnosis of the infection foreach subject experiencing an infection.
16 10 49 15 6 15 8 19 11 22 13 17
(a) The mean (b) The median
Ans: 16.75 Ans: 15(c) The mode (d) The range
Ans: 15 Ans: 43
(e) The variance (f) The standard deviationAns: 124.0227 Ans: 11.1365
(g) The coefficient of variation Ans: 66.49%
Text Book : Basic Concepts andMethodology for the Health Sciences 88
8/10/2019 Wayne Daniel Ppt
45/186
/04/34
Q2.5.6:The purpose of a study by Nozama et al. (A-13) was to evaluate the outcome of surgical repair
of pars interarticularis defect by segmental wirefixation in young adults with lumbar spondylolysis.The authors found that segmental wire fixationhistorically has been successful in the treatment ofnonathletes with spondylolysis, but no informationexisted on the results of this type of surgery inathletes. In a retrospective study, the authorsfound 20 subjects who had the surgery between
1993 and 2000. For these subjects, the data below
Text Book : Basic Concepts andMethodology for the Health Sciences 89
represent the duration in months of follow-up careafter the operation.
103 68 62 60 60 54 49 44 42 41 3836 34 30 19 19 19 19 17 16
(a) The mean (b) The median
Ans: 41.5 Ans: 39.5
(c) The mode (d) The rangeAns: 19 Ans: 87
(e) The variance (f) The standard deviation
Ans: 490.264 Ans: 22.1419
Text Book : Basic Concepts andMethodology for the Health Sciences 90
8/10/2019 Wayne Daniel Ppt
46/186
/04/34
(g) The coefficient of variation Ans: 53.35%
Q2.5.14:
In a pilot study, Huizinga et al. ( A-14) wanted togain more insight into the psychosocialconsequences for children of a parent with cancer.For the study, 14 families participated insemistructured interviews and completedstandardized questionnaires. Below is the age ofthe sick parent with cancer (in years) for the 14families.
Text Book : Basic Concepts andMethodology for the Health Sciences 91
37 48 53 46 42 49 44 38 32 32 51 51 48 41
(a) The mean (b) The medianAns: 43.7143 Ans: 45
(c) The mode (d) The range
Ans: 32, 51 Ans: 21(e) The variance (f) The standard deviation
Ans: 48.0659 Ans: 6.93296(g) The coefficient of variation Ans: 15.8597%
Text Book : Basic Concepts andMethodology for the Health Sciences 92
8/10/2019 Wayne Daniel Ppt
47/186
/04/34
3Chapter
Probability
The Basis of the Statistical
inference
Key words:
Probability, objective Probability,
subjective Probability, equally likely
Mutually exclusive, multiplicative ruleConditional Probability, independent events, Bayes
theorem
Text Book : Basic Concepts andMethodology for the Health Sciences 94
8/10/2019 Wayne Daniel Ppt
48/186
/04/34
Introduction1.3
The concept of probability is frequently encountered ineveryday communication. For example, a physician maysay that a patient has a 50-50 chance of surviving a certainoperation.
Another physician may say that she is 95 percent certainthat a patient has a particular disease.
Most people express probabilities in terms of percentages.
But, it is more convenient to express probabilities asfractions. Thus, we may measure the probability of theoccurrence of some event by a number between 0 and 1.
The more likely the event, the closer the number is to one.An event that can't occur has a probability of zero, and anevent that is certain to occur has a probability of one.
Text Book : Basic Concepts andMethodology for the Health Sciences 95
Some definitions of:2.3probability
1.Equally likely outcomes:
Are the outcomes that have the same chance ofoccurring.
2.Mutually exclusive:
Two events are said to be mutually exclusive ifthey cannot occur simultaneously such that
A B =.
Text Book : Basic Concepts andMethodology for the Health Sciences 96
8/10/2019 Wayne Daniel Ppt
49/186
/04/34
The universal Set(S): The set all possibleoutcomes.
The empty set : Contain no elements. The event ,E: is a set of outcomes in S which has a
certain characteristic.
Classical Probability: If an event can occur in Nmutually exclusive and equally likely ways, and if m ofthese possess a triat, E, the probability of theoccurrence of event E is equal to m/ N .
P(A)=n(A)/n(S)
For Example: in the rolling of the die , each of the
six sides is equally likely to be observed . So, theprobability that a 4 will be observed is equal to 1/6.
Text Book : Basic Concepts andMethodology for the Health Sciences 97
Relative Frequency Probability:
Def:If some posses is repeated a large number oftimes, n, and if some resulting event E occurs m times, the relative frequency of occurrence of E , m/n willbe approximately equal to probability of E .
P(E) = m/n .
Text Book : Basic Concepts andMethodology for the Health Sciences 98
8/10/2019 Wayne Daniel Ppt
50/186
/04/34
Elementary Properties of3.3:Probability
Given some process (or experiment )with n mutually exclusive events E1, E2,E3,, En, then
1-P(Ei) 0, i= 1,2,3,n 2- P(E1 )+ P(E2) ++P(En)=1
3- P(Ei+EJ)=P(Ei )+ P(EJ )
Ei,EJare mutually exclusive
Text Book : Basic Concepts andMethodology for the Health Sciences 99
Rules of Probability1-Addition Rule P(A U B)= P(A) + P(B)P (AB )
2- If A and B are mutually exclusive (disjoint) ,then P (AB ) = 0 Then , addition rule is
P(A U B)= P(A) + P(B) . 3- Complementary Rule P(A' )= 1P(A) where, A' = = complement event Consider example 3.4.1 Page 63
Text Book : Basic Concepts andMethodology for the Health Sciences 100
8/10/2019 Wayne Daniel Ppt
51/186
/04/34
Table 3.4.1 in Example 3.4.1TotalLater >18
(L)Early = 18
(E)
Family history of
Mood Disorders
633528Negative(A)
573819Bipolar
Disorder(B)
854441Unipolar (C)
1136053Unipolar and
Bipolar(D)318177141Total
Text Book : Basic Concepts andMethodology for the Health Sciences 101
Find the following probabilities:
1. P(C) =
2.P(L) =
(=3.P(AcE) =3.P(A
4.P(L U D)=
B) =5.P(AE) =6.P(L
(=Ac7.P(E
(=U Ac8.P(L
Text Book : Basic Concepts andMethodology for the Health Sciences 102
A
8/10/2019 Wayne Daniel Ppt
52/186
/04/34
Answer the following questions:**
Suppose we pick a person at random from this sample.1-The probability that this person will be 18-years old?
2-The probability that this person has family history of moodorders Unipolar(C)?
3-The probability that this person has no family history of moodorders Unipolar( )?
4-The probability that this person is 18-years old or has no familyhistory of mood orders Unipolar (C))?
5-The probability that this person is more than18-years old andhas family history of mood orders Unipolar and Bipolar(D)?
Text Book : Basic Concepts andMethodology for the Health Sciences 103
C
Conditional Probability:
P(A\B) is the probability of A assuming that B has
happened.
P(A\B)= , P(B) 0
P(B\A)= , P(A) 0
)(
)(
)(
)(
BP
BAPor
Bn
BAn
)(
)(
)(
)(
AP
BAPor
An
BAn
Text Book : Basic Concepts andMethodology for the Health Sciences 104
8/10/2019 Wayne Daniel Ppt
53/186
/04/34
64Page2.4.3ExampleFrom previous example 3.4.1 Page 63 , answer
suppose we pick a person at random and find he is 18
years (E),what is the probability that this person will
be one with Negative family history of mood
disorders (A)?
suppose we pick a person at random and find he has
family history of mood (D) what is the probability that
this person will be 18 years (E)?
Text Book : Basic Concepts andMethodology for the Health Sciences 105
:Calculating a joint Probability
Example 3.4.3.Page 64
Suppose we pick a person at randomfrom the 318 subjects. Find theprobability that he will early (E) and has
Negative family history of mooddisorders (A).
Exercise: Example 3.4.5.Page 66
Exercise: Example 3.4.6.Page 67
Text Book : Basic Concepts andMethodology for the Health Sciences 106
8/10/2019 Wayne Daniel Ppt
54/186
/04/34
Independent Events:
If A has no effect on B, we said that A,Bare independent events.
Then,
1- P(AB)= P(B)P(A) 2- P(A\B)=P(A)
3- P(B\A)=P(B)
Text Book : Basic Concepts andMethodology for the Health Sciences 107
68Page7.4.3Example In a certain high school class consisting of 100
students60 , 40 of them are boys, it is observed that24 girls and 16 boys wear eyeglasses . If a student ispicked at random from this class
1.The probability that the student wears eyeglasses ,
2.What is the probability that a student picked at
random wears eyeglasses given that the student is aboy?
3.What is the probability of the joint occurrence ofthe events of wearing eye glasses and being a girl?
Text Book : Basic Concepts andMethodology for the Health Sciences 108
8/10/2019 Wayne Daniel Ppt
55/186
/04/34
SolutionTake the following symbols:
not a boy = girl:B : is boy. BC
not were glasses:G:were glasses GC
n(s)=100 ,n(B)=40,n(GB)=16,
n(GBC)=24
Text Book : Basic Concepts andMethodology for the Health Sciences 109
TotalBCB
n(G)n(BCG)
n(BG)G
n(GC
)n(BC
GC
)n(BGC
)GC
n(S)(n(BCn(B)Total
1.P(G)=
=2.P(G/B)
(=3.P(GBC
TotalBCB
2416G
GC
10040Total
Text Book : Basic Concepts andMethodology for the Health Sciences 110
8/10/2019 Wayne Daniel Ppt
56/186
/04/34
69Page8.4.3Example
Suppose that of 1200 admission to a general
hospital during a certain period of time,750 are
private admissions. If we designate these as a set A,
then compute P(A) , P( ).
Exercise: Example 3.4.9.Page 76
A
Text Book : Basic Concepts andMethodology for the Health Sciences 111
Marginal Probability: Definition: Given some variable that can be broken down into m
categories designatedby and another jointly occurring variable
that is broken down into n categories designated by, the marginal probability of with all the categories of
B . That is,
for all value of j Example 3.4.9.Page 76 Use data of Table 3.4.1, and rule of marginal
Probabilities to calculate P(E).,P(L),P(A),..
Text Book : Basic Concepts andMethodology for the Health Sciences 112
),()( jii BAPAP
mi AAAA ,.......,,.......,, 21
nj BBBB ,.......,,.......,, 21
iA
8/10/2019 Wayne Daniel Ppt
57/186
/04/34
Exercise:
Page 76-77 Questions :
3.4.1, 3.4.3,3.4.4
H.W.
3.4.5 , 3.4.7
Text Book : Basic Concepts andMethodology for the Health Sciences 113
Q3.4.1: In a study of violent victimization of womenand men, Porcelli et al. (A-2) collected informationfrom 679 women and 345 men aged 18 to 64years at several family practice centers in themetropolitan Detroit area. Patients filled out ahealth history questionnaire that included aquestion about victimization. The following tableshows the sample subjects cross-classified by sex
and type of violent victimization reported. Thevictimization categories are defined as novictimization, partner victimization (and not byothers), victimization by persons other than
Text Book : Basic Concepts and
Methodology for the Health Sciences 114
8/10/2019 Wayne Daniel Ppt
58/186
/04/34
partners (friends, family members, or strangers), andthose who reported multiple victimization.
(a) Suppose we pick a subject at random from this
group. What is the probability that this subject willbe a women?
Text Book : Basic Concepts andMethodology for the Health Sciences 115
TotalMultipleVictimization
(T)
Nonpartners
(N)
Partners
(P)
NoVictimization
(V)
679181634611Women(W)
345101710308Men(M)
1024283344919Total
(b) What do we call the probability calculated inpart a?
(c) Show how to calculate the probability asked forin part a by two additional methods.
(d) If we pick a subject at random, what isprobability that the subject will be a women andhave experienced partner abuse?
(e) What do we call the probability calculated inpart d?
(f) Suppose we picked a man at random. Knowingthis information, what is the probability that he
Text Book : Basic Concepts andMethodology for the Health Sciences 116
8/10/2019 Wayne Daniel Ppt
59/186
/04/34
experienced abuse from nonpartners?
(g) What do we call the probability calculated in
part f?(h) Suppose we pick a subject at random. Whatis the probability that it is a man or someonewho experienced abuse from a partner?
(i) What do we call the method by which youobtained the probability in part h?
Text Book : Basic Concepts andMethodology for the Health Sciences 117
H.WQ3.4.3: Fernando et al. (A-3) studied drug-sharingamong injection drug users in the South Bronx inNew York City. Drug users in New York City usethe term split a bag or get down on a bag torefer to the practice of diving a bag of heroin orother injectable substances. A common practiceincludes splitting drugs after they are dissolved ina common cooker, a procedure with considerable
HIV risk. Although this practice is common, little isknown about the prevalence of such practices. Theresearchers asked injection drug users in fourneighborhoods in the South Bronx if they ever
Text Book : Basic Concepts andMethodology for the Health Sciences 118
8/10/2019 Wayne Daniel Ppt
60/186
/04/34
got down on drugs in bags or shots. The resultsclassified by gender and splitting practice are
given below:State the
following
probabilities in
words and calculate:
(a) Ans: 0.3418
(b) Ans: 0.8746(c) Ans: 0.6134
Text Book : Basic Concepts and
Methodology for the Health Sciences 119
TotalNever SplitDrugs
Split DrugsGender
673324349Male
348128220Female
1021452569Total
)( DrugsSplitMaleP
)( DrugsSplitMaleP
)( DrugsSplitMaleP
(d) Ans: 0.6592
Q3.4.4: Laveist and Nuru-Jeter (A-4) conducteda study to determine if doctor-patient raceconcordance was associated with greatersatisfaction with care. Toward that end, theycollected a national sample of African-
American, Caucasian, Hispanic, and Asian-American respondents. The following tableclassifies the race of the subjects as well as therace of their physician:
Text Book : Basic Concepts and
Methodology for the Health Sciences 120
)(MaleP
8/10/2019 Wayne Daniel Ppt
61/186
/04/34
(a) What is the probability that a randomly
selected subject will have an Asian/Pacific-Islander physician? Ans: 0.1533
Text Book : Basic Concepts andMethodology for the Health Sciences 121
Patient Race
TotalAsian-American
HispanicAfrican-American
CaucasianPhysiciansRace
1796175406436779White19651516214African-
American
16621281719Hispanic
417203717568Asian/Pacific-Island
1454565530Other
2720389676745910Total
(b) What is the probability that an African-Americansubject will have an African- American physician?
Ans: 0.2174
(c) What is the probability that a randomly selectedsubject in the study will be Asian-American andhave an Asian/Pacific-Islander physician?Ans: 0.075
(d) What is the probability that a subject chosen atrandom will be Hispanic or have a Hispanicphysician? Ans: 0.2625
(e) Use the concept of complementary events tofind the probability that a subject chosen at
Text Book : Basic Concepts andMethodology for the Health Sciences 122
8/10/2019 Wayne Daniel Ppt
62/186
/04/34
random in the study does not have a whitephysician? Ans: 0.3397
Q3.4.5:
If the probability of left-handedness in a certaingroup of people is 0.5, what is the probabilityof right-handedness (assuming noambidexterity)?
Text Book : Basic Concepts andMethodology for the Health Sciences 123
Q3.4.6:The probability is 0.6 that a patient selected atrandom from the current residents of acertain hospital will be a male. The probabilitythat the patient will be a male who is in forsurgery is 0.2. A patient randomly selectedfrom current residents is found to be a male;
what is the probability that the patient is inthe hospital for surgery?
Ans: 0.3333
Text Book : Basic Concepts andMethodology for the Health Sciences 124
8/10/2019 Wayne Daniel Ppt
63/186
/04/34
Q3.4.7:
In a certain population of hospital patients the
probability is 0.35 that a randomly selectedpatient will have heart disease. The probabilityis 0.86 that a patient with heart disease is asmoker. What is the probability that a patientrandomly selected from the population will bea smoker and have heart disease?
Ans: 0.301
Text Book : Basic Concepts andMethodology for the Health Sciences 125
Text Book : Basic Concepts andMethodology for the Health Sciences 126
Baye's TheoremPages 79-83
8/10/2019 Wayne Daniel Ppt
64/186
/04/34
In this case if the patient has to do ablood test in the laboratory,some time the result is
(he has the disease) and ifPositivenegativethe result is
(he doesn't has the disease)
Text Book : Basic Concepts andMethodology for the Health Sciences 127
So, we have the following cases
The patient has the
disease(D)
The patient doesn't has
the disease( D )
Lab result is
Negative(T)
wrong result
Specificity
A symptom
P(T|D)
Lab result is
positive
( T )
Sensitivity
A symptom
P(T|D)
wrong result
Text Book : Basic Concepts andMethodology for the Health Sciences 128
8/10/2019 Wayne Daniel Ppt
65/186
/04/34
Text Book : Basic Concepts andMethodology for the Health Sciences 129
Definition.1
The sensitivity of the symptom
This is the probability of a positive result given that the
subject has the disease. It is denoted by P(T|D)
2Definition.
The specificity of the symptomThis is the probability of negative result given that the
subject does not have the disease. It is denoted by
P(T|D)
:3Definition
The predictive value positive of the symptom
This is the probability that the subject has thedisease given that the subject has a positivescreening test result.
It is calculated using bayes theorem through thefollowing formula
Where P(D) is the rate of the disease
Text Book : Basic Concepts and
Methodology for the Health Sciences 130
)()|()()|()()|()|(
DPDTPDPDTPDPDTPTDP
8/10/2019 Wayne Daniel Ppt
66/186
/04/34
Which is given by
P(D) = 1P(D)P(T/ D) = 1 - P(T/ D)
the numerator is equal to sensitivityNote thattimes rate of the disease, while thedenominator is equal to sensitivity times rateof the disease plus 1 minus the specificity
times one minus the rate of the disease
Text Book : Basic Concepts andMethodology for the Health Sciences 131
Text Book : Basic Concepts andMethodology for the Health Sciences 132
Definition 4
The predictive value negative of the symptom
This is the probability that a subject does not have the
disease given that the subject has a negative
screening test result .It is calculated using Bayes
Theorem through the following formula
where,
)()|()()|(
)()|()|(
DPDTPDPDTP
DPDTPTDP
)|(1)|( DTPDTp
8/10/2019 Wayne Daniel Ppt
67/186
/04/34
Text Book : Basic Concepts andMethodology for the Health Sciences 133
Example 3 5 1 page 82
A medical research team wished to evaluate a proposed screening test for
Alzheimers disease. The test was given to a random sample of 450 patients
with Alzheimers disease and an independent random sample of 500 patients
without symptoms of the disease. The two samples were drawn from
populations of subjects who were 65 yearsor older. The results are as follows.
Test Result Yes (D) No ( ) Total
Positive(T) 436 5 441
Negativ( ) 14 495 509
Total 450 500 950
T
D
Text Book : Basic Concepts andMethodology for the Health Sciences 134
In the context of this example
a)What is a false positive?
A false positive is when the test indicates a positive result (T) when
the person does not have the disease
b) What is the false negative?
A false negative is when a test indicates a negative result ( )
when the person has the disease (D).
c) Compute the sensitivity of the symptom.
d) Compute the specificity of the symptom.
D
T
9689.0450
436)|( DTP
99.0500
495)|( DTP
8/10/2019 Wayne Daniel Ppt
68/186
/04/34
Text Book : Basic Concepts andMethodology for the Health Sciences 135
e) Suppose it is known that the rate of the disease in the general population
is 11.3%. What is the predictive value positive of the symptom and the
predictive value negative of the symptomThe predictive value positive of the symptom is calculated as
The predictive value negative of the symptom is calculated as
996.0.113)(0.0311)(087)(0.99)(0.8
87)(0.99)(0.8
)()|()()|(
)()|()|(
DPDTPDPDTP
DPDTPTDP
925.00.113)-(.01)(1.113)(0.9689)(0
.113)(0.9689)(0
)()|()()|(
)()|()|(
DPDTPDPDTP
DPDTPTDP
Exercise:
Page 83
Questions :
3.5.1, 3.5.2
H.W.:
Page 87 : Q4,Q5,Q7,Q9,Q21
Text Book : Basic Concepts andMethodology for the Health Sciences 136
8/10/2019 Wayne Daniel Ppt
69/186
/04/34
Q3.5.1;A medical research team wishes toassess the usefulness of a certain symptom
(call it S) in the diagnosis of a particulardisease. In a random sample of 775 patientswith the disease, 744 reported having thesymptom. In an independent random sampleof 1380 subjects without the disease, 21reported that they had the symptom.
(a) In the context of this exercise, what is a false
positive?(b) What is a false negative?
Text Book : Basic Concepts and
Methodology for the Health Sciences 137
(c) Compute the sensitivity of the symptom.
(d) Compute the specificity of the symptom.
(e) Suppose it is known that the rate of the diseasesin the general population is 0.001. what is thepredictive value positive of the symptom?
(f) What is the predictive value negative of thesymptom?
Text Book : Basic Concepts andMethodology for the Health Sciences 138
8/10/2019 Wayne Daniel Ppt
70/186
/04/34
(h) What do you conclude about the predictivevalue of the symptom on the basis of the results
obtained in part g?Q3.5.2:
Dorsay and Helms (A-6) performed aretrospective study of 71 knees scanned by MRI.One of the indicators they examined was theabsence of the bow-tie sign in the MRI asevidence of a bucket-handle or bucket-handle
type tear of the meniscus.
Text Book : Basic Concepts andMethodology for the Health Sciences 139
In the study, surgery confirmed that 43 of the71 cases were bucket-handle tears. The casesmay be cross-classified by bow-tie signstatus and surgical results as follows:
Text Book : Basic Concepts andMethodology for the Health Sciences 140
TotalTear SurgicallyConfirmed As Not
Present ( )
Tear SurgicallyConfirmed (D)
481038Positive Test(absent bow-tie sign)
(T)
23185Negative Test(bow-tie present)( )
712843Total
D
T
8/10/2019 Wayne Daniel Ppt
71/186
/04/34
(a) What is the sensitivity of testing to see if theabsent bow-tie sign indicates a meniscal tear?
Ans: 0.8837(b) What is the specificity of testing to see if theabsent bow-tie sign indicates a meniscal tear?
Ans: 0.6229
(c) What additional information would you need todetermine the predictive value of the test?
Text Book : Basic Concepts andMethodology for the Health Sciences 141
(d) Suppose it is known that the rate of thedisease in the general population is 0.1, what isthe predictive value positive of the symptom?Ans: 0.20659
(e) What is predictive value negative of the
symptom? Ans: 0.9797
Text Book : Basic Concepts andMethodology for the Health Sciences 142
8/10/2019 Wayne Daniel Ppt
72/186
/04/34
Chapter 4:Probabilistic features of
certain data Distributions
Pages 93- 111
Key words
Probability distribution , random variable ,
Bernolli distribution, Binomail distribution,
Poisson distribution
Text Book : Basic Concepts andMethodology for the Health Sciences 144
8/10/2019 Wayne Daniel Ppt
73/186
/04/34
The Random Variable (X):
When the values of a variable (height, weight,or age) cant be predicted in advance, thevariable is called a random variable.
An example is the adult height.
When a child is born, we cant predict exactly
his or her height at maturity.
Text Book : Basic Concepts andMethodology for the Health Sciences 145
4.2 Probability Distributions forDiscrete Random Variables
Definition:
The probability distribution of a discreterandom variable is a table, graph, formula,or other device used to specify all
possible values of a discrete randomvariable along with their respectiveprobabilities.
Text Book : Basic Concepts andMethodology for the Health Sciences 146
8/10/2019 Wayne Daniel Ppt
74/186
/04/34
The Cumulative Probability
Distribution of X, F(x):
It shows the probability that the variable
X is less than or equal to a certain value,
F(x)=P(X x).
Text Book : Basic Concepts andMethodology for the Health Sciences 147
Example 4.2.1 page 94:
F(x)=
P(X x)
P(X=x)
(f/n)=
Frequeny
(f)
Number ofPrograms
0.20880.2088621
0.36700.1582472
0.49830.1313393
0.62960.13133940.82490.1953585
0.94950.1246376
0.96300.013547
1.00000.0370118
1.0000297TotalText Book : Basic Concepts and
Methodology for the Health Sciences 148
8/10/2019 Wayne Daniel Ppt
75/186
/04/34
Properties of probability distribution
of discrete random variable. 1. 0 P(X = x) 1
2. P(X = x) =1
3. P(a X b)=P(X b)- P(X a-1)
4. P(X < b)= P(X b-1)
Text Book : Basic Concepts andMethodology for the Health Sciences 149
Example 4.2.2 page 96: (use table inexample 4.2.1)
What is the probability that a randomlyselected family will be one who usedthree assistance programs?
Example 4.2.3 page 96: (use table in
example 4.2.1)
What is the probability that a randomlyselected family used either one or twoprograms?
Text Book : Basic Concepts andMethodology for the Health Sciences 150
8/10/2019 Wayne Daniel Ppt
76/186
/04/34
Example 4.2.4 page 98: (use table inexample 4.2.1)
What is the probability that a family picked at
random will be one who used two or fewerassistance programs? Example 4.2.5 page 98: (use table in
example 4.2.1)What is the probability that a randomlyselected family will be one who used fewerthan four programs?
Example 4.2.6 page 98: (use table inexample 4.2.1)
What is the probability that a randomly selectedfamily used five or more programs?
Text Book : Basic Concepts andMethodology for the Health Sciences 151
Example 4.2.7 page 98: (use table in
example 4.2.1)
What is the probability that a randomlyselected family is one who used betweenthree and five programs, inclusive?
Text Book : Basic Concepts andMethodology for the Health Sciences 152
8/10/2019 Wayne Daniel Ppt
77/186
/04/34
Find the following probability
For the following probability distribution table:
P(X= 15) =
P(X = 10)=
P( X = 11)=
P(X 10)=
P( 5 X 15)=
P(X 10.5) =
Mean =
Text Book : Basic Concepts andMethodology for the Health Sciences
153
P(X=x)X
0.235
0.3010
-------15
0.4420
For the following probability distribution table:
Find:
P(X 6) = P(X < 5)=
P( X = 7)=
P(X =5)=
P(X > 4)=
P( 4 X 6)= Mean=
Text Book : Basic Concepts andMethodology for the Health Sciences 154
P(X x)X
0.103
0.254
0.555
0.606
0.78718
8/10/2019 Wayne Daniel Ppt
78/186
/04/34
The Binomial Distribution:3.4 The binomial distribution is one of the most
widely encountered probability distributions inapplied statistics. It is derived from a process
known as a Bernoulli trial.
Bernoulli trial is:
When a random process or experiment called a
trial can result in only one of two mutually
exclusive outcomes, such as dead or alive, sick
or well, the trial is called a Bernoulli trial.
Text Book : Basic Concepts andMethodology for the Health Sciences 155
The Bernoulli Process A sequence of Bernoulli trials forms a Bernoulli
process under the following conditions
1- Each trial results in one of two possible, mutuallyexclusive, outcomes. One of the possible outcomesis denoted (arbitrarily) as a success, and the other isdenoted a failure.
2- The probability of a success, denoted by p, remainsconstant from trial to trial. The probability of afailure, 1-p, is denoted by q.
3- The trials are independent, that is the outcome ofany particular trial is not affected by the outcome ofany other trial
Text Book : Basic Concepts andMethodology for the Health Sciences 156
8/10/2019 Wayne Daniel Ppt
79/186
/04/34
The probability distribution of the binomial
random variableX, the number of successes in
nindependenttrials is:
Where is the number of combinations of ndistinct objects taken x ofthemata time.
* Note: 0! =1
Text Book : Basic Concepts andMethodology for the Health Sciences 157
( ) ( ) , 0,1,2,....,X n X
nf x P X x p q x n
x
n
x
!
!( )!
n n
x n xx
! ( 1)( 2)....(1)x x x x
Properties of the binomial distribution
1. f(x) 0 2. f(x) =1 3.The parameters of the binomial
distribution are n and p
4. = E(X)= n p 5.2= Var(X)= npq
Text Book : Basic Concepts andMethodology for the Health Sciences 158
8/10/2019 Wayne Daniel Ppt
80/186
/04/34
Example 4.3.1 page 100 If we examine all birth records from the North
Carolina State Center for Health statistics for year2001, we find that 85.8 percent of the pregnancieshad delivery in week 37 or later (full- term birth).
If we randomly selected five birth records from thispopulation what is the probability that exactly threeof the records will be for full-term births?
Exercise: example 4.3.2 page 104
Text Book : Basic Concepts andMethodology for the Health Sciences 159
Example 4.3.3 page 104 Suppose it is known that in a certain
population 10 percent of the population iscolor blind. If a random sample of 25 people isdrawn from this population, find the probabilitythat
a) Two or fewer will be color blind.
b) 24 or more will be color blindc) Between two and five inclusive will be color
blind.
d) At most one will be color blind.
Exercise: example 4.3.4 page 106
Text Book : Basic Concepts andMethodology for the Health Sciences 160
8/10/2019 Wayne Daniel Ppt
81/186
/04/34
The Poisson Distribution4.4 If the random variable X is the number of
occurrences of some random event in a certainperiod of time or space (or some volume of matter).
The probability distribution of X is given by:
f (x) =P(X=x) = ,x = 0,1,..
The symbol e is the constant equal to 2.7183.(Lambda) is called the parameter of the distributionand is the average number of occurrences of the
random event in the interval (or volume)
!
x
x
e
Text Book : Basic Concepts andMethodology for the Health Sciences 161
Properties of the Poisson distribution
1. f(x) 0 2. f(x) =1 3.The parameters of the poisson
distribution is
4. = E(X)= 5.2= Var(X)=
Text Book : Basic Concepts andMethodology for the Health Sciences 162
8/10/2019 Wayne Daniel Ppt
82/186
/04/34
Example 4.4.1 page 111
In a study of a drug -induced anaphylaxis amongpatients taking rocuronium bromide as part oftheir anesthesia, Laake and Rottingen foundthat the occurrence of anaphylaxis followed aPoisson model with =12 incidents per yearin Norway .Find
1- The probability that in the next year, among
patients receiving rocuronium, exactly threewill experience anaphylaxis?
Text Book : Basic Concepts andMethodology for the Health Sciences 163
2- The probability that less than two patientsreceiving rocuronium, in the next year willexperience anaphylaxis?
3- The probability that more than two patientsreceiving rocuronium, in the next 6-month
will experience anaphylaxis?
4- The expected value of patients receivingrocuronium, in the next year who will experienceanaphylaxis.
5- The variance of patients receiving rocuronium,in the next year who will experience anaphylaxis
6- The standard deviation of patients receivingrocuronium, in the next year who will experienceanaphylaxis
Text Book : Basic Concepts andMethodology for the Health Sciences 164
8/10/2019 Wayne Daniel Ppt
83/186
/04/34
Example 4.4.2 page 111: Refer to
example 4.4.1
1-What is the probability that at least threepatients in the next year will experienceanaphylaxis if rocuronium is administered withanesthesia?
2-What is the probability that exactly one patientin the next month will experience anaphylaxis ifrocuronium is administered with anesthesia?
3-What is the probability that none of thepatients in the next month will experienceanaphylaxis if rocuronium is administered withanesthesia?
Text Book : Basic Concepts andMethodology for the Health Sciences 165
4-What is the probability that at mosttwo patients in the next year willexperience anaphylaxis if rocuronium isadministered with anesthesia?
Exercises: examples 4.4.3, 4.4.4 and
4.4.5 pages111-113 Exercises: Questions 4.3.4 ,4.3.5,
4.3.7 ,4.4.1,4.4.5
Text Book : Basic Concepts andMethodology for the Health Sciences 166
8/10/2019 Wayne Daniel Ppt
84/186
/04/34
Excercices:
111Page:4.3.4QThe same survey data base cited shows that 32 percent ofU.S adults indicated that they have been tested for HIV atsome points in their life .Consider a simple random sampleof 15 adults selected at that time .Find the probability
that the number of adults who have been
tested for HIV in the sample would be:
Text Book : Basic Concepts andMethodology for the Health Sciences 167
Hint:
( ) ( ) , 0,1,2,....,X n X
nx P X x p q x n
x
Text Book : Basic Concepts andMethodology for the Health Sciences 168
8/10/2019 Wayne Daniel Ppt
85/186
/04/34
(a) Three (Ans. 0.1457)
(b) Less than two (Ans. 0.02477)
(c ) At most one (Ans. 0.02477)
(d) At least three (Ans. 0.9038)
(e) between three and five ,inclusive.
Text Book : Basic Concepts and
Methodology for the Health Sciences 169
Q4.3.5
refer to Q4.3.4 , find the mean and thevariance?
(Answer: mean = 4.8 ,
variance =3.264 )
Text Book : Basic Concepts andMethodology for the Health Sciences 170
8/10/2019 Wayne Daniel Ppt
86/186
/04/34
Q 4.4.3 :If the mean number of serious accidents per year
in a large factory is five ,find the probability thatthe current year there will be:
Hint: f(x)=
per yeara) Exactly seven accidents(
per yearb) Ten or more accidents(months6perc) No accident(
.per yeard)fewer than one accidents(
Text Book : Basic Concepts andMethodology for the Health Sciences 171
!x
e x
For Q 4.4.3:Q4.4.4
Find:
months-6per.mean1
2. variance and standard
months3deviation per
Text Book : Basic Concepts andMethodology for the Health Sciences 172
8/10/2019 Wayne Daniel Ppt
87/186
/04/34
4.5 Continuous
Probability Distribution
Pages 114127
Key words:
Continuous random variable, normaldistribution , standard normal distribution
, T-distribution
Text Book : Basic Concepts andMethodology for the Health Sciences 174
8/10/2019 Wayne Daniel Ppt
88/186
/04/34
Now consider distributions of
continuous random variables.
Text Book : Basic Concepts andMethodology for the Health Sciences 175
Properties of continuousprobability Distributions:
1- Area under the curve = 1.
2- P(X = a) = 0 , where ais a constant.
3- Area between two points a , b
= P(a
8/10/2019 Wayne Daniel Ppt
89/186
/04/34
4.6 The normal distribution:
It is one of the most important probabilitydistributions in statistics.
The normal density is given by
, - < x < , - 0
, e : constants
: population mean.
: Population standard deviation.
Text Book : Basic Concepts andMethodology for the Health Sciences 177
2
2
2
)(
2
1)(
x
exf
Characteristics of the normal distribution:Page 111
The following are some important characteristicsof the normal distribution:
1- It is symmetrical about its mean, .2- The mean, the median, and the mode are all equal.
3- The total area under the curve above the x-axis isone.
4-The normal distribution is completely determinedby the parameters and .
Text Book : Basic Concepts andMethodology for the Health Sciences 178
8/10/2019 Wayne Daniel Ppt
90/186
/04/34
5- The normal distribution
depends on the two
parameters
and
.
determines thelocation of
the curve.
(As seen in figure 4.6.3) ,
But,
determines
the scale of the curve, i.e.
the degree of flatness or
peaked ness of the curve.(as seen in figure 4.6.4)
Text Book : Basic Concepts andMethodology for the Health Sciences 179
1 2 3
1< 2< 3
1
2
3
1< 2< 3
The Standard normal distribution:
Is a special case of normal distribution with
mean equal 0 and a standard deviation of 1.
The equation for the standard normal
distribution is written as
, - < z <
Text Book : Basic Concepts andMethodology for the Health Sciences 180
2
2
2
1)(
z
ezf
8/10/2019 Wayne Daniel Ppt
91/186
/04/34
Characteristics of the standard
normal distribution
1- It is symmetrical about 0.2- The total area under the curve above thex-axis is one.
3- We can use table (D) in the Appendix tofind the probabilities and areas.
Text Book : Basic Concepts andMethodology for the Health Sciences 181
How to use tables of ZNote that
The cumulative probabilities P(Z z)are given in
tables for -3.49 < z < 3.49. Thus,
P (-3.49 < Z < 3.49) 1.
For standard normal distribution,
P (Z > 0) = P (Z < 0) = 0.5
Example 4.6.1:
If Z is a standard normal distribution, then
1) P( Z < 2) =0.9772
is the area to the left to 2
and it equals 0.9772.
Text Book : Basic Concepts andMethodology for the Health Sciences 182
2
8/10/2019 Wayne Daniel Ppt
92/186
/04/34
Example 4.6.2:
P(-2.55 < Z < 2.55)is the area between
-2.55 and 2.55, Then it equals
P(-2.55 < Z < 2.55)=0.99460.0054
= 0.9892.
Example 4.6.2:
P(-2.74 < Z < 1.53)is the area between
-2.74 and 1.53.
P(-2.74 < Z < 1.53)=0.93700.0031
= 0.9339.
Text Book : Basic Concepts andMethodology for the Health Sciences 183
-2.74 1.53
-2.55 2.550
Example 4.6.3:P(Z > 2.71)is the area to the right to 2.71.So,
P(Z > 2.71)=10.9966 = 0.0034.
Example :
P(Z = 0.84)is the area at z = 0.84.
So,
P(Z = 0.84)= 0
Text Book : Basic Concepts andMethodology for the Health Sciences 184
0.84
2.71
8/10/2019 Wayne Daniel Ppt
93/186
/04/34
Exercise
Given Standard normal distribution by usingthe tables :
2:The area to the left of Z=1.6.4
:2.6.4
The area under the curve Z =0, Z= 1.43
)=55.0: P(Z 3.6.4
)=35.2-: P(Z
8/10/2019 Wayne Daniel Ppt
94/186
/04/34
1Given the following probabilities, find z11.6.4
P(Z z1) = 0.0055 (z1=-2.54)12.6.4
P(-2.67Z z1) = 0.9718 (z1=1.97)13.6.4
P(Z > z1) = 0.0384 (z1=1.77):11.6.4
P(z1 < Z 2.98) = 0.1117 (z1=1.21)
Text Book : Basic Concepts andMethodology for the Health Sciences 187
How to transform normal distribution(X) to standard normal distribution
(Z)?
This is done by the following formula:
Example:
If X is normal with = 3, = 2. Find the value of
standard normal Z, If X= 6?
Answer:
Text Book : Basic Concepts andMethodology for the Health Sciences 188
x
z
5.12
36
xz
8/10/2019 Wayne Daniel Ppt
95/186
/04/34
4.7 Normal Distribution Applications
The normal distribution can be used to model the distribution of
many variables that are of interest. This allow us to answer
probability questions about these random variables.
Example 4.7.1:
The Uptime is a custom-made light weight battery-operated
activity monitor that records the amount of time an individual
spend the upright position. In a study of children ages 8 to 15
years. The researchers found that the amount of time children
spend in the upright position followed a normal distribution withMean of 5.4 hours and standard deviation of 1.3.Find
Text Book : Basic Concepts andMethodology for the Health Sciences 189
If a child selected at random ,then1-The probability that the child spend less than 3hours in the upright position 24-hour period
P( X < 3) = P( < ) = P(Z < -1.85) = 0.0322
-------------------------------------------------------------------------
2-The probability that the child spend more than 5hours in the upright position 24-hour period
P( X > 5) = P( > ) = P(Z > -0.31)
= 1- P(Z < - 0.31) = 1- 0.3520= 0.648
-----------------------------------------------------------------------3-The probability that the child spend exactly 6.2hours in the upright position 24-hour period
P( X = 6.2) = 0
X
3.1
4.53
Text Book : Basic Concepts andMethodology for the Health Sciences 190
X
3.1
4.55
8/10/2019 Wayne Daniel Ppt
96/186
/04/34
4-The probability that the child spend from 4.5 to
7.3 hours in the upright position 24-hour period
P( 4.5 < X < 7.3) = P( < < )
= P( -0.69 < Z < 1.46 ) = P(Z
8/10/2019 Wayne Daniel Ppt
97/186
/04/34
Exercises
oldyears-29For another subject (:1.7.4Qmale) in the study by Diskin, aceton level were
normally distributed with mean of 870 and standard
deviation of 211 ppb. Find the probability that in a
given day the subjects acetone level is :
(a) between 600 and 1000 ppb
(b) over 900 ppb
(c ) under 500 ppb (d) At 700 ppb
Text Book : Basic Concepts andMethodology for the Health Sciences 193
In the study of fingerprints an important:2.7.4Qquantitative characteristic is the total ridge count for
the 10 fingers of an individual . Suppose that the total
ridge counts of individuals in a certain population are
approximately normally distributed with mean of 140
and a standard deviation of 50 .Find the probability
that an individual picked at random from this
population will have ridge count of :
(a) 200 or more
(Answer :0.0985)
Text Book : Basic Concepts andMethodology for the Health Sciences 194
8/10/2019 Wayne Daniel Ppt
98/186
/04/34
(b) less than 200 (Answer :0.8849)
(c) between 100 and 200
(Answer :0.6982)
(d) between 200 and 250
(Answer :0.0934)
Text Book : Basic Concepts andMethodology for the Health Sciences 195
The T Distribution:
)7367
1- It has mean of zero.
2- It is symmetric about the
mean.3- It ranges from -to .
Text Book : Basic Concepts andMethodology for the Health Sciences 196
0
8/10/2019 Wayne Daniel Ppt
99/186
/04/34
4- compared to the normal distribution, the t
distribution is less peaked in the center andhas higher tails.
5- It depends on the degrees of freedom (n-1).
6- The t distribution approaches the standardnormal distribution as (n-1) approaches .
7- Can find values of t in the table (E) in
the Appendix.
Text Book : Basic Concepts andMethodology for the Health Sciences 197
Examples
t (7, 0.975) = 2.3646
------------------------------
t (24, 0.995) = 2.7696
--------------------------
If P (T(18)> t) = 0.975,
then t = -2.1009-------------------------
If P (T(22)< t) = 0.99,
then t = 2.508
Text Book : Basic Concepts andMethodology for the Health Sciences 198
0.005
t(24, 0.995)
0.995
t(7, 0.975)
0.0250.975
t
0.9750.025
0.99
0.01
t
8/10/2019 Wayne Daniel Ppt
100/186
/04/34
0
Find :
t0.95,10
= 1.8125
---------------------------------
t 0.975,18 = 2.1009
---------------------------------
t 0.01,20 = - 2.528
---------------------------------
t 0.10,29 = - 1.311
---------------------------------
Text Book : Basic Concepts andMethodology for the Health Sciences 199
Sampling4.6Distributions:
Definition:1.4.6
The probability distribution of a statistic is called a
sampling distribution.
Sampling Distributions of1.1.4.6
Means:
2
2( ) , ( ) (3)X X
E X V Xn
8/10/2019 Wayne Daniel Ppt
101/186
/04/34
1
Definition:2.4.6
Central Limit Theorem:1.2.4.6
If is the mean of a random sample of size ntaken from a population with mean and finitevariance , then the limiting form of the
distribution of:
is approximately the standard normal
distribution;
(4)/
XZ as n
n
X
2
( ) ~ (0,1) ( ) , ( ) ,
~ ( , ) (5)
XZ N E X V X
n n
X Nn
:Example
An electrical firm manufactures light bulbs that have a
length of life that is approximately normally
distributed with mean equal to 800hours and a
standard deviation of 40hours. Find :
probability that a random sample of 16
bulbs will have an average life of lessthan 775hours.
8/10/2019 Wayne Daniel Ppt
102/186
/04/34
2
:Solution
Let Xbe the length of life and is theaverage life;
X
16, 800, 40
775 800( 775) ( ( 2.5) 0.0062
40/ 16
n
P X P Z P Z
Sampling Distribution of the sampleProportion:
Let X= no. of elements of type Ain the
sample
P= population proportion = no. of elements
of type A in the population / N = sample proportion= no. of elements of
type A in the sample / n = x/n
p
8/10/2019 Wayne Daniel Ppt
103/186
/04/34
3
3. For large n, we have:
~ ( , )
~ (0,1)
pqp N p
n
p pZ N
pq
n
~ ( , ) ( ) , ( )
1. ( ) ( )
2. ( ) ( ) , 1
x binomial n p E x np V x npq
xE p E pn
x pqV p V q p
n n
Example:
If X is Binomial distribution with n=10,P=0.1, find P
Solution:
Text Book : Basic Concepts andMethodology for the Health Sciences 206
)2.0( P
85314.0)05.1(
)009.0
1.0()
10
)9.0)(1.0(
1.02.0()2.0(
009.0
10
)9.0)(1.0()(
1.0)(
ZP
ZP
n
pq
PPPPP
n
pqPV
PPE
8/10/2019 Wayne Daniel Ppt
104/186
/04/34
4
6Chapter
Using sample data to make
estimates about population
)172-162parameters (P
Key words:
Point estimate, interval estimate, estimator,
Confident level ,, Confident interval for mean ,
Confident interval for two means,
Confident interval for population proportion P,
Confident interval for two proportions
Text Book : Basic Concepts andMethodology for the Health Sciences 208
8/10/2019 Wayne Daniel Ppt
105/186
/04/34
5
6.1 Introduction: Statistical inference is the procedure by which we reach toa conclusion about a population on the basis of the
information contained in a sample drawn from thatpopulation.
To any parameter, we can compute two types of estimate:a point estimate and an interval estimate.
Apoint estimateis a single numerical value used toestimate the corresponding population parameter.
Note:
Point estimate for ( ) is
Point estimate for ( ) is S
Text Book : Basic Concepts andMethodology for the Health Sciences 209
X
8/10/2019 Wayne Daniel Ppt
106/186
/04/34
6
Definition:
An interval estimateconsists of two
numerical values defining a range of valuesthat, with a specified degree ofconfidence, we feel includes theparameter being estimated.
Text Book : Basic Concepts andMethodology for the Health Sciences 211
aConfidence Interval for2.6Population Mean: (C.I)
Suppose researchers wish to estimate the mean ofsome normally distributed population.
They draw a random sample of size n from thepopulation and compute , which they use as a pointestimate of .
Because random sampling involves chance, thencant be expected to be equal to .
The value of may be greater than or less than.
It would be much more meaningful to estimate byan interval.
x
Text Book : Basic Concepts andMethodology for the Health Sciences 212
x
8/10/2019 Wayne Daniel Ppt
107/186
/04/34
7
percent confidence interval-1The:(C.I.) for
We want to find two values L :Lower bound andU:Upper bound between which lies with highprobability, i.e.
P( L U ) = 1-
Text Book : Basic Concepts andMethodology for the Health Sciences 213
For example:
When,
= 0.01,
then 1- =
= 0.05,
then 1- =
= 0.05,then 1- =
Text Book : Basic Concepts andMethodology for the Health Sciences 214
8/10/2019 Wayne Daniel Ppt
108/186
/04/34
8
For example:
When, (1- )100%:Level of confident
(1- )100% = 90%,
then =
99%
then =
80%
then =
Text Book : Basic Concepts andMethodology for the Health Sciences 215
(1- )100% confident interval for the mean : When the value of sample size (n):
population is normal or not normal population is normal
( n 30 ) (n< 30)
is known is not k