REVIEW OF STATISTICAL METHODS
HE MISIRI
COMMUNITY HEALTH DEPARTMENT
• Describe how you will conduct descriptive statistical analysis in your study
• Describe how you will conduct hypothesis testing in your study (when applicable)
• Describe the statistical tests you will use to analyse data from your proposed study
Random Error
• Research is usually conducted on samples.
• It is expensive, time-consuming and logistically difficult to conduct a census.
• Sample estimates will always be inexact because of sampling error, also known as random variation.
• The smaller the sample, the greater the variation.
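The link between sample size and random variation can be illustrated with a small simulation. This is only a minimal sketch: the population (normal, mean 50, SD 10), the sample sizes and the function name are illustrative assumptions, not values from any study cited in these slides.

```python
import random
import statistics


def spread_of_sample_means(n, n_samples=500, pop_mean=50.0, pop_sd=10.0, seed=1):
    """Return the SD of the means of repeated random samples of size n.

    The population parameters are hypothetical and chosen for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        means.append(statistics.mean(sample))
    return statistics.stdev(means)


small = spread_of_sample_means(n=10)
large = spread_of_sample_means(n=1000)
print(f"Spread of sample means, n=10:   {small:.2f}")
print(f"Spread of sample means, n=1000: {large:.2f}")
```

Means from the small samples scatter much more widely around the population mean than means from the large samples.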
Types of Data
• Categorical data come from categorical variables such as eye colour, sex, marital status and level of education.
• A categorical variable has categories, e.g. sex is categorised as male or female.
• Continuous variables can assume any value on the real line.
• Continuous data come from continuous variables.
Scales of measurement
• Nominal: Sex
• Ordinal: Severity of pain
• Interval/Ratio: Weight, Speed
Describing data
Data can be described by using:
• Charts
• Tables
• Numerical summary values
• Shapes of distributions
1. Charts: Histogram, Pie chart
2. Tables: Frequency distribution
Age group    Number of patients
< 30         30
31-40        102
41-50        162
51-60        96
61-70        22
71-80        4
Total        416
Percentage distribution: Satisfaction with nursing care
Level of satisfaction   No. of patients   Percentage
Very satisfied          121               25.5
Satisfied               161               33.9
Neutral                 90                18.9
Dissatisfied            51                10.7
Very dissatisfied       52                10.9
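As a rough illustration, the percentage column of the satisfaction table above can be reproduced from the counts alone. The function name is hypothetical; the counts are the ones shown in the table.

```python
def percentage_distribution(counts):
    """Convert a mapping of category -> count into category -> percentage."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


# Counts taken from the satisfaction table above.
satisfaction = {
    "Very satisfied": 121,
    "Satisfied": 161,
    "Neutral": 90,
    "Dissatisfied": 51,
    "Very dissatisfied": 52,
}

percentages = percentage_distribution(satisfaction)
for category, pct in percentages.items():
    print(f"{category:<18} {satisfaction[category]:>4} {pct:>6.1f}")
```

Because each percentage is rounded separately, the column need not add up to exactly 100.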
Table
Sex       Mean age (SD)
Males     20.3 (1.2)
Females   18.2 (1.6)
All       19.3 (1.8)
Table from Misiri et al. (2012b)
HIV rates: Misiri et al. (2012a)
4. Shapes of distributions: Symmetry and kurtosis
The degree of “peakedness” (Chris Caple, 1991) is called kurtosis.
• Positively skewed
• Negatively skewed
[Figures: positively skewed distribution; symmetric distribution; kurtosis]
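Skewness and kurtosis can also be expressed as numbers. The sketch below uses the simple moment-based definitions, which are only one of several conventions in use; under this one, a normal distribution has skewness 0 and excess kurtosis 0. The function name and both data sets are hypothetical.

```python
import statistics


def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (population formulas)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skew, excess_kurtosis


symmetric = [1, 2, 3, 4, 5]          # hypothetical symmetric data
right_skewed = [1, 1, 2, 2, 3, 10]   # hypothetical data with a long right tail

for name, data in (("symmetric", symmetric), ("right-skewed", right_skewed)):
    skew, kurt = skewness_and_excess_kurtosis(data)
    print(f"{name}: skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}")
```

A positive skewness corresponds to a positively skewed distribution (long right tail) and a negative value to a negatively skewed one.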
Variation in sample data
Numerical summaries
• For categorical data, one uses counts (frequencies), percentages, proportions or rates to describe the data.
• For continuous data, one uses measures of central tendency and variation.
3. Numerical values: Summary statistics
Examples of summary statistics are:
A. Measures of central tendency: Mean, Median, Mode
B. Measures of variation: Variance, Standard deviation, Range, Interquartile range
C. Other statistics: Proportion, Percentiles, etc.
Examples: categorical data
• In a class of 200 students, 51 are males and 149 are females (numbers).
• 25.5% of patients were very satisfied with nursing care (percentage).
• The prevalence of Chlamydia in young women in England in 1996 was 3.1% (proportion).
• The incidence rate of cancer is 90 cases per 100,000 person-years (rate).
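The prevalence and incidence-rate figures above are simple ratios. The sketch below shows the arithmetic; the function names and all counts are hypothetical, chosen only so that the results have the same size as the figures quoted above. They are not the data behind those figures.

```python
def prevalence_percent(existing_cases, population):
    """Prevalence: existing cases as a percentage of the population examined."""
    return 100 * existing_cases / population


def incidence_rate_per_100k(new_cases, person_years):
    """Incidence rate: new cases per 100,000 person-years of follow-up."""
    return 100_000 * new_cases / person_years


# Hypothetical counts for illustration only.
prevalence = prevalence_percent(existing_cases=31, population=1000)
incidence = incidence_rate_per_100k(new_cases=45, person_years=50_000)
print(f"Prevalence: {prevalence:.1f}%")
print(f"Incidence rate: {incidence:.0f} per 100,000 person-years")
```

Note that prevalence has people in the denominator, whereas the incidence rate has person-time.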
More examples: Misiri et al. (2012b)
Zverev & Misiri (2009)
Descriptive statistics: Categorical data
• Proportions
• Percentages
• Each proportion should be reported with a confidence interval (CI)
• Categorical data are better summarised in a percentage distribution or frequency distribution
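One common way to attach a CI to a proportion is the normal-approximation (Wald) interval. The sketch below assumes that method; others, such as the Wilson interval, exist and may be preferable for small samples or proportions near 0 or 1. It uses the 121 "very satisfied" patients out of the 475 in the satisfaction table as an example. The function name is hypothetical.

```python
from statistics import NormalDist


def proportion_ci(successes, n, level=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for a 95% CI
    se = (p * (1 - p) / n) ** 0.5                  # standard error of the proportion
    return p, p - z * se, p + z * se


p, lower, upper = proportion_ci(121, 475)
print(f"{100 * p:.1f}% (95% CI {100 * lower:.1f}% to {100 * upper:.1f}%)")
```

The width of the interval shows how precisely the sample estimates the population proportion.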
Appropriate average to use
• Use the mean and standard deviation for symmetric data.
• Use the median and range or quartiles for skewed data.
Misiri et al. (2012c)
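The reason for preferring the median with skewed data can be seen in a tiny example. Both data sets below are hypothetical; the second differs from the first only in one extreme value.

```python
import statistics

symmetric = [4, 5, 6, 7, 8]   # hypothetical symmetric data
skewed = [4, 5, 6, 7, 48]     # same data with one extreme value

for name, data in (("symmetric", symmetric), ("skewed", skewed)):
    print(f"{name}: mean = {statistics.mean(data)}, median = {statistics.median(data)}")
```

The single extreme value pulls the mean well away from the bulk of the data, while the median does not move.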
Standard deviation
• SD = sqrt(sum of squared deviations from the mean / (n - 1)) = sqrt(44.8/4) = 3.3
• This is, roughly, the average variation in the data.
• That is, individual data points differ from the sample mean by about 3.3 on average.
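The calculation above can be sketched as follows. It assumes that 44.8 is the sum of squared deviations from the sample mean and that the divisor 4 is n - 1, i.e. a sample of 5 observations; the raw data are not shown on the slide, so the second data set is hypothetical and serves only to check the formula against Python's built-in `statistics.stdev`.

```python
import math
import statistics


def sd_from_sum_of_squares(sum_sq_dev, n):
    """Sample SD from the sum of squared deviations and the sample size."""
    return math.sqrt(sum_sq_dev / (n - 1))


def sample_sd(values):
    """Sample SD computed step by step from raw values."""
    mean = sum(values) / len(values)
    sum_sq_dev = sum((x - mean) ** 2 for x in values)
    return sd_from_sum_of_squares(sum_sq_dev, len(values))


# Figures from the slide (n = 5 is an assumption inferred from the divisor 4).
slide_sd = sd_from_sum_of_squares(44.8, 5)
print(f"SD = {slide_sd:.1f}")

# Hypothetical raw data, used only to confirm the formula.
data = [2, 4, 4, 5, 10]
print(math.isclose(sample_sd(data), statistics.stdev(data)))
```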
• A normal distribution is symmetric and bell-shaped. If the data in a population follow a normal distribution, then the standard deviation (the measure of spread around the mean) has these properties:
• The range from 1 SD below to 1 SD above the mean includes about 68% of the distribution.
• The range from 2 SDs below to 2 SDs above the mean includes about 95% of the distribution.
• The range from 3 SDs below to 3 SDs above the mean includes about 99.7% of the distribution.
• The standard deviation is not used to describe scatter around the median. The measure of scatter around the median is the INTER-QUARTILE RANGE.
• There are three quartiles, at the 25th, 50th and 75th percentiles. They divide the data into four quarters, just as the median (the 50th percentile) divides it into two halves. The inter-quartile range is the range of values between the 25th and 75th percentiles. These values are used to draw a box-and-whisker plot.
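The 68%, 95% and 99.7% figures quoted above for the normal distribution can be checked with the standard normal distribution in Python's standard library. This is a minimal sketch; the function name is hypothetical.

```python
from statistics import NormalDist


def coverage_within(k_sd):
    """Proportion of a normal distribution lying within k_sd SDs of the mean."""
    std_normal = NormalDist()  # mean 0, SD 1
    return std_normal.cdf(k_sd) - std_normal.cdf(-k_sd)


for k in (1, 2, 3):
    print(f"Within {k} SD of the mean: {100 * coverage_within(k):.1f}%")
```

The figure for 2 SDs comes out slightly above 95%; exactly 95% of the distribution lies within about 1.96 SDs of the mean.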
Bell-shaped distribution
Example: Plasma glucose
• 4.67
• 4.97
• 5.11
• 5.17
• 5.33
• 6.22
• 6.50
• 7.00
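The median and quartiles of the eight plasma glucose values above can be computed as in the sketch below. Statistical packages use several slightly different conventions for quartiles; this example assumes the default "exclusive" method of Python's `statistics.quantiles`, so other software may give slightly different quartile values for the same data.

```python
import statistics

# Plasma glucose values from the example above.
glucose = [4.67, 4.97, 5.11, 5.17, 5.33, 6.22, 6.50, 7.00]

median = statistics.median(glucose)
# n=4 returns the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(glucose, n=4)
iqr = q3 - q1

print(f"Median = {median:.2f}")
print(f"Lower quartile = {q1:.3f}, upper quartile = {q3:.3f}")
print(f"Inter-quartile range = {iqr:.3f}")
```

The lower quartile, the median and the upper quartile are the three values that define the box in a box-and-whisker plot.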
Hypothesis Testing
• Null
• Alternative
• Type I Error
• Type II Error
• Level of significance
Example
• Null hypothesis: mothers attending ANC at clinic A are as likely to be attended by a skilled birth attendant as mothers attending ANC at clinic B.
• Alternative hypothesis: mothers attending ANC at clinic A are either more likely or less likely to be attended by a skilled birth attendant than mothers attending ANC at clinic B.
Paired samples t-test
• See example
Independent sample t-test
• See example
• The p-value is the probability, assuming the null hypothesis is true, of obtaining a test statistic as extreme as the one observed in your sample, or even more extreme.
Example:
• Ho: Mean Difference = 0
• Ha: Mean Difference > 0
• The test statistic is Z
• Given that the level of significance is 5%:
• We will reject Ho if the p-value < 5%
• This is because a result this extreme would be unlikely to occur by chance alone if Ho were true.
• We will fail to reject Ho if the p-value ≥ 5%
• This is because our findings are then consistent with what Ho states. Failing to reject Ho does not prove that Ho is true.
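The decision rule in this example can be sketched in code. The sketch assumes a one-sided test (Ha: Mean Difference > 0) with a Z statistic that follows the standard normal distribution under Ho. The two z values and the function names are hypothetical, and a p-value at or above the significance level is reported as "fail to reject Ho".

```python
from statistics import NormalDist


def one_sided_p_value(z):
    """P(Z >= z) under Ho, for the alternative Mean Difference > 0."""
    return 1 - NormalDist().cdf(z)


def decide(z, alpha=0.05):
    """Return the p-value and the decision at significance level alpha."""
    p = one_sided_p_value(z)
    decision = "reject Ho" if p < alpha else "fail to reject Ho"
    return p, decision


for z in (0.80, 2.10):  # hypothetical test statistics
    p, decision = decide(z)
    print(f"z = {z:.2f}: p-value = {p:.4f} -> {decision}")
```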
WARNING!
• Do not abuse p-values
• P-values should always be accompanied by confidence intervals.
• Confidence intervals give the magnitude of the effect as well as the precision of estimation.
Example: Zverev & Misiri (2009)
• One-way analysis of variance revealed a significant effect of shift phase on total sleep duration (F = 36.8, d.f. = 8, P < 0.001).
• Ho: The mean total sleep durations of the three shift phases are equal.
• Ha: The mean total sleep durations of the three shift phases are not all equal.
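To show what the F statistic in a one-way analysis of variance measures, the sketch below computes it by hand as the ratio of between-group to within-group variation. The sleep durations are hypothetical and are NOT the data of Zverev & Misiri (2009); the function name is also hypothetical.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.fmean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical total sleep durations (hours) in three shift phases.
phases = [
    [6.0, 7.0, 8.0],
    [5.0, 6.0, 7.0],
    [3.0, 4.0, 5.0],
]

f_stat, df_b, df_w = one_way_anova_f(phases)
print(f"F = {f_stat:.1f}, d.f. = ({df_b}, {df_w})")
```

A large F means the group means differ by more than the scatter within groups would suggest, which counts as evidence against Ho.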
Summary of methods