Stats 244.3(02) Review

Summarizing Data: Graphical Methods

  • Slide 1
  • Stats 244.3(02) Review
  • Slide 2
  • Summarizing Data Graphical Methods
  • Slide 3
  • Histogram, Stem-and-Leaf Diagram, Grouped Frequency Table, Box-and-Whisker Plot
  • Slide 4
  • Summary Numerical Measures
  • Slide 5
  • Measures of Central Location: 1. Mean (centre of gravity) 2. Median (middle observation)
  • Slide 6
  • Measures of Non-Central Location: 1. Percentiles 2. Quartiles: 1. Lower quartile (Q1) (25th percentile) (lower mid-hinge) 2. Median (Q2) (50th percentile) (hinge) 3. Upper quartile (Q3) (75th percentile) (upper mid-hinge)
  • Slide 7
  • Measures of Variability (Dispersion, Spread): 1. Range 2. Inter-Quartile Range 3. Variance, standard deviation 4. Pseudo-standard deviation
  • Slide 8
  • 1. Range: R = Range = max − min 2. Inter-Quartile Range: IQR = Q3 − Q1
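Both spread measures can be computed directly; a minimal sketch in Python (the data values are made up, and NumPy's default linear percentile interpolation can differ slightly from hand hinge rules):

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])

# Range = max - min
r = data.max() - data.min()

# Quartiles via linear interpolation (NumPy's default method)
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
```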
  • Slide 9
  • The Sample Variance is defined as the quantity s² = Σ(xᵢ − x̄)²/(n − 1) and is denoted by the symbol s².
  • Slide 10
  • The Sample Standard Deviation s Definition: The Sample Standard Deviation is defined by s = √s². Hence the Sample Standard Deviation, s, is the square root of the sample variance.
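A small sketch of both definitions (the data values are made up):

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
xbar = sum(data) / n                              # sample mean

# Sample variance: sum of squared deviations from the mean, divided by n - 1
s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)
s = math.sqrt(s2)                                 # sample standard deviation
```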
  • Slide 11
  • Interpretations of s: In Normal distributions, approximately 2/3 of the observations will lie within one standard deviation of the mean, and approximately 95% of the observations lie within two standard deviations of the mean. In a histogram of the Normal distribution, the standard deviation is approximately the distance from the mode to the inflection point.
  • Slides 12–14
  • [Figures: a Normal curve whose inflection point lies one standard deviation s from the mode; roughly 2/3 of the area lies within s of the mean and roughly 95% within 2s.]
  • Slide 15
  • Computing formulae for s and s²: the sum of squares of deviations from the mean can also be computed using the following identity: Σ(xᵢ − x̄)² = Σxᵢ² − (Σxᵢ)²/n
  • Slide 16
  • Then: s² = [Σxᵢ² − (Σxᵢ)²/n] / (n − 1)
  • Slide 17
  • Slide 18
  • A quick (rough) calculation of s: s ≈ Range/4. The reason for this is that approximately all (95%) of the observations are between x̄ − 2s and x̄ + 2s. Thus Range ≈ 4s.
  • Slide 19
  • The Pseudo Standard Deviation (PSD) Definition: The Pseudo Standard Deviation (PSD) is defined by PSD = IQR/1.35.
  • Slide 20
  • Properties: For Normal distributions the magnitude of the pseudo standard deviation (PSD) and the standard deviation (s) will be approximately the same value. For leptokurtic distributions the standard deviation (s) will be larger than the PSD. For platykurtic distributions the standard deviation (s) will be smaller than the PSD.
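The slide's PSD formula did not survive extraction; a common definition consistent with the properties above is PSD = IQR/1.35, since for a Normal distribution IQR ≈ 1.35σ. A quick check on simulated Normal data:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=0.0, scale=1.0, size=100_000)   # Normal data, sigma = 1

s = x.std(ddof=1)                   # sample standard deviation
q1, q3 = np.percentile(x, [25, 75])
psd = (q3 - q1) / 1.35              # pseudo standard deviation

# For Normal data, s and PSD should nearly agree
```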
  • Slide 21
  • Measures of Shape
  • Slide 22
  • Skewness Kurtosis
  • Slide 23
  • Skewness: based on the sum of cubes. Kurtosis: based on the sum of 4th powers.
  • Slide 24
  • The Measure of Skewness: g1 = [Σ(xᵢ − x̄)³/n] / s³
  • Slide 25
  • The Measure of Kurtosis: g2 = [Σ(xᵢ − x̄)⁴/n] / s⁴ − 3
  • Slide 26
  • Interpretations of Measures of Shape: Skewness: g1 > 0 (skewed right), g1 = 0 (symmetric), g1 < 0 (skewed left). Kurtosis: g2 < 0 (platykurtic), g2 = 0 (normal), g2 > 0 (leptokurtic).
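The cube- and fourth-power-based measures can be sketched with central moments (here the moment forms m2, m3, m4 are used; a symmetric sample gives g1 = 0, and a flat sample gives negative g2):

```python
def shape_measures(data):
    n = len(data)
    xbar = sum(data) / n
    m2 = sum((x - xbar) ** 2 for x in data) / n   # 2nd central moment
    m3 = sum((x - xbar) ** 3 for x in data) / n   # based on the sum of cubes
    m4 = sum((x - xbar) ** 4 for x in data) / n   # based on the 4th powers
    g1 = m3 / m2 ** 1.5                           # skewness
    g2 = m4 / m2 ** 2 - 3                         # kurtosis (excess)
    return g1, g2

g1, g2 = shape_measures([1, 2, 3, 4, 5])   # symmetric, flat sample
```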
  • Slide 27
  • Inferential Statistics: making decisions regarding the population based on a sample
  • Slide 28
  • Estimation by Confidence Intervals Definition: A (100P)% confidence interval for an unknown parameter is a pair of sample statistics (t1 and t2) having the following properties: 1. P[t1 < t2] = 1; that is, t1 is always smaller than t2. 2. P[the unknown parameter lies between t1 and t2] = P. The statistics t1 and t2 are random variables. Property 2 states that the probability that the unknown parameter is bounded by the two statistics t1 and t2 is P.
  • Slide 29
  • Confidence Interval for a Proportion: p̂ ± z_{α/2} √(p̂(1 − p̂)/n)
  • Slide 30
  • Determination of Sample Size: The sample size that will estimate p with an Error Bound B and level of confidence P = 1 − α is n = p*(1 − p*)(z_{α/2}/B)², where B is the desired Error Bound, z_{α/2} is the α/2 critical value for the standard normal distribution, and p* is some preliminary estimate of p.
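A sketch of the sample-size formula n = p*(1 − p*)(z_{α/2}/B)², rounding up to a whole subject (the function name and the example numbers are illustrative):

```python
import math
from statistics import NormalDist

def sample_size_proportion(B, conf, p_star=0.5):
    """n needed to estimate p within error bound B at the given confidence."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)          # z_{alpha/2}
    return math.ceil(p_star * (1 - p_star) * (z / B) ** 2)

# Worst case p* = 0.5, error bound 3 points, 95% confidence
n = sample_size_proportion(B=0.03, conf=0.95)
```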
  • Slide 31
  • Confidence Intervals for the mean, μ, of a Normal Population: x̄ ± z_{α/2} σ/√n (with s replacing σ for large samples)
  • Slide 32
  • Determination of Sample Size: The sample size that will estimate μ with an Error Bound B and level of confidence P = 1 − α is n = (z_{α/2} s*/B)², where B is the desired Error Bound, z_{α/2} is the α/2 critical value for the standard normal distribution, and s* is some preliminary estimate of σ.
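Likewise for the mean, n = (z_{α/2}·s*/B)² (again, the numbers are illustrative):

```python
import math
from statistics import NormalDist

def sample_size_mean(B, conf, s_star):
    """n needed to estimate the mean within error bound B."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    return math.ceil((z * s_star / B) ** 2)

# Preliminary sd estimate 15, error bound 2, 95% confidence
n = sample_size_mean(B=2, conf=0.95, s_star=15)
```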
  • Slide 33
  • Hypothesis Testing An important area of statistical inference
  • Slide 34
  • Definition: a Hypothesis (H) is a statement about the parameters of the population. In hypothesis testing there are two hypotheses of interest: the null hypothesis (H0) and the alternative hypothesis (HA).
  • Slide 35
  • Type I, Type II Errors: 1. Rejecting the null hypothesis when it is true (type I error). 2. Accepting the null hypothesis when it is false (type II error).
  • Slide 36
  • Decision Table showing types of Error:
                 H0 is True          H0 is False
    Accept H0    Correct Decision    Type II Error
    Reject H0    Type I Error        Correct Decision
  • Slide 37
  • To define a statistical Test we: 1. Choose a statistic (called the test statistic). 2. Divide the range of possible values for the test statistic into two parts: the Acceptance Region and the Critical Region.
  • Slide 38
  • To perform a statistical Test we: 1. Collect the data. 2. Compute the value of the test statistic. 3. Make the Decision: if the value of the test statistic is in the Acceptance Region we decide to accept H0; if the value of the test statistic is in the Critical Region we decide to reject H0.
  • Slide 39
  • Probability of the two types of error Definitions: For any statistical testing procedure define α = P[Rejecting the null hypothesis when it is true] = P[type I error] and β = P[accepting the null hypothesis when it is false] = P[type II error].
  • Slide 40
  • Determining the Critical Region: 1. The Critical Region should consist of values of the test statistic that indicate that HA is true (hence H0 should be rejected). 2. The size of the Critical Region is determined so that the probability of making a type I error, α, is at some pre-determined level (usually 0.05 or 0.01). This value is called the significance level of the test. Significance level α = P[test makes type I error].
  • Slide 41
  • To find the Critical Region: 1. Find the sampling distribution of the test statistic when H0 is true. 2. Locate the Critical Region in the tails (either left or right or both) of the sampling distribution of the test statistic when H0 is true. Whether you locate the critical region in the left tail or right tail or both tails depends on which values indicate HA is true. The tails chosen = values indicating HA.
  • Slide 42
  • 3. The size of the Critical Region is chosen so that the area over the critical region, under the sampling distribution of the test statistic when H0 is true, is the desired level of α = P[type I error]. [Figure: sampling distribution of the test statistic when H0 is true, with a Critical Region of area α.]
  • Slide 43
  • The z-tests Testing the probability of success Testing the mean of a Normal Population
  • Slide 44
  • Critical Regions for testing the probability of success, p, with test statistic z = (p̂ − p0)/√(p0(1 − p0)/n): HA: p > p0 → reject if z > z_α; HA: p < p0 → reject if z < −z_α; HA: p ≠ p0 → reject if |z| > z_{α/2}.
  • Slide 45
  • Critical Regions for testing the mean, μ, of a normal population, with test statistic z = (x̄ − μ0)/(σ/√n): HA: μ > μ0 → reject if z > z_α; HA: μ < μ0 → reject if z < −z_α; HA: μ ≠ μ0 → reject if |z| > z_{α/2}.
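A one-sample z test following this recipe, for a two-sided alternative (the numbers are hypothetical and σ is taken as known):

```python
from statistics import NormalDist

# H0: mu = 50 vs HA: mu != 50, sigma known
xbar, mu0, sigma, n = 52.0, 50.0, 10.0, 25

z = (xbar - mu0) / (sigma / n ** 0.5)       # test statistic
z_crit = NormalDist().inv_cdf(0.975)        # alpha = 0.05, two-tailed
reject = abs(z) > z_crit                    # is z in the critical region?
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
```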
  • Slide 46
  • You can compare a statistical test to a meter: the value of the test statistic is the needle, and the Critical Region is the red zone of the meter.
  • Slide 47
  • If the value of the test statistic falls in the Acceptance Region: Accept H0.
  • Slide 48
  • If the value of the test statistic falls in the Critical Region: Reject H0.
  • Slide 49
  • Acceptance Region Critical Region Sometimes the critical region is located on one side. These tests are called one tailed tests.
  • Slide 50
  • Whether you use a one tailed test or a two tailed test depends on: 1.The hypotheses being tested (H 0 and H A ). 2.The test statistic.
  • Slide 51
  • If only large positive values of the test statistic indicate HA, then the critical region should be located in the positive tail (1 tailed test). If only large negative values of the test statistic indicate HA, then the critical region should be located in the negative tail (1 tailed test). If both large positive and large negative values of the test statistic indicate HA, then the critical region should be located in both the positive and negative tails (2 tailed test).
  • Slide 52
  • Usually 1 tailed tests are appropriate if HA is one-sided, and two tailed tests are appropriate if HA is two-sided. But not always.
  • Slide 53
  • The p-value approach to Hypothesis Testing
  • Slide 54
  • Definition: Once the test statistic has been computed from the data, the p-value is defined to be: p-value = P[the test statistic is as or more extreme than the observed value of the test statistic when H0 is true]. "More extreme" means giving stronger evidence for rejecting H0.
  • Slide 55
  • Properties of the p-value: 1. If the p-value is small (less than the significance level α), reject H0.
  • The approximate test for comparing two means of Normal Populations (unequal variances). Test statistic: t = (x̄ − ȳ)/√(s1²/n + s2²/m). Null Hypothesis H0: μ1 = μ2. Alt. Hypothesis and Critical Region: HA: μ1 ≠ μ2 → |t| > t_{α/2}; HA: μ1 > μ2 → t > t_α; HA: μ1 < μ2 → t < −t_α.
  • Slide 93
  • Confidence intervals for the difference in two means of normal populations (small samples, unequal variances): (1 − α)100% confidence limits for μ1 − μ2 are (x̄ − ȳ) ± t_{α/2} √(s1²/n + s2²/m).
  • Slide 94
  • The paired t-test An example of improved experimental design
  • Slide 95
  • The matched pair experimental design (the paired sample experiment): Prior to assigning the treatments, the subjects are grouped into pairs of similar subjects. Suppose that there are n such pairs (a total of 2n = n + n subjects or cases). The two treatments are then randomly assigned to each pair: one member of a pair will receive treatment 1, while the other receives treatment 2. The data collected is as follows: (x1, y1), (x2, y2), (x3, y3), …, (xn, yn), where xᵢ = the response for the case in pair i that receives treatment 1 and yᵢ = the response for the case in pair i that receives treatment 2.
  • Slide 96
  • Let xᵢ = the measurement of the response for the subject in pair i that received treatment 1. Let yᵢ = the measurement of the response for the subject in pair i that received treatment 2. The data: (x1, y1), (x2, y2), (x3, y3), …, (xn, yn).
  • Slide 97
  • To test H0: μ1 = μ2 is equivalent to testing H0: μd = 0 (we have converted the two sample problem into a single sample problem). The test statistic is the single sample t-test on the differences d1, d2, d3, …, dn, namely t = d̄/(s_d/√n), df = n − 1.
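The paired computation can be sketched as follows (the before/after measurements are made up):

```python
import math

# One pair per subject; treatment 1 and treatment 2 responses
x = [12, 14, 11, 15, 13]
y = [10, 13, 11, 14, 11]
d = [xi - yi for xi, yi in zip(x, y)]      # the differences d_i

n = len(d)
dbar = sum(d) / n
s_d = math.sqrt(sum((di - dbar) ** 2 for di in d) / (n - 1))
t = dbar / (s_d / math.sqrt(n))            # compare with a t-table, df = n - 1
```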
  • Slide 98
  • Testing for the equality of variances The F test
  • Slide 99
  • The test statistic: F = s1²/s2². The sampling distribution of the test statistic: if the Null Hypothesis (H0) is true, then the sampling distribution of F is called the F-distribution with ν1 = n − 1 degrees of freedom in the numerator and ν2 = m − 1 degrees of freedom in the denominator.
  • Slide 100
  • The F distribution: ν1 = n − 1 degrees of freedom in the numerator, ν2 = m − 1 degrees of freedom in the denominator: F(ν1, ν2)
  • Slide 101
  • Critical region for the test (two sided alternative): Reject H0 if F > F_{α/2}(n − 1, m − 1) or F < F_{1−α/2}(n − 1, m − 1).
  • Slide 102
  • Critical region for the test (one sided alternative, one tailed): Reject H0 if F > F_α(n − 1, m − 1).
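A sketch of the two-sided F test, using SciPy for the critical values (the sample variances and sizes are made up):

```python
from scipy.stats import f

# Hypothetical sample variances and sample sizes
s1_sq, n = 12.0, 10    # sample 1
s2_sq, m = 8.0, 16     # sample 2

F = s1_sq / s2_sq                        # test statistic, df = (n-1, m-1)
upper = f.ppf(0.975, n - 1, m - 1)       # F_{alpha/2}
lower = f.ppf(0.025, n - 1, m - 1)       # F_{1-alpha/2}
reject = F > upper or F < lower          # two-sided test at alpha = 0.05
```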
  • Slide 103
  • Summary of Tests
  • Slide 104
  • One Sample Tests: H0: p = p0 versus HA: p > p0, p ≠ p0, or p < p0
  • Slide 105
  • Two Sample Tests
  • Slide 106
  • Two Sample Tests - continued. Situation: two independent Normal samples with unknown means and unequal variances. H0: μ1 = μ2; HA: μ1 ≠ μ2 → |t| > t_{α/2} (df = ν*); HA: μ1 > μ2 → t > t_α (df = ν*). H0: σ1² = σ2²; HA: σ1² ≠ σ2² → F > F_{α/2}(n − 1, m − 1) or F < F_{1−α/2}(n − 1, m − 1).
  • The paired t test. Situation: n matched pairs of subjects are treated with two treatments; dᵢ = xᵢ − yᵢ has mean μd. H0: μd = 0; HA: μd ≠ 0 → |t| > t_{α/2} (df = n − 1); HA: μd > 0 → t > t_α (df = n − 1).
  • The test for independence (zero correlation). H0: X and Y are independent; HA: X and Y are correlated. The test statistic: t = r√(n − 2)/√(1 − r²). The Critical Region: reject H0 if |t| > t_{α/2} (df = n − 2). This is a two-tailed critical region; the critical region could also be one-tailed.
  • Slide 166
  • Spearman's rank correlation coefficient ρ (rho)
  • Slide 167
  • Spearman's rank correlation coefficient ρ is computed as follows: Arrange the observations on X in increasing order and assign them the ranks 1, 2, 3, …, n. Arrange the observations on Y in increasing order and assign them the ranks 1, 2, 3, …, n. For any case i let (xᵢ, yᵢ) denote the observations on X and Y and let (rᵢ, sᵢ) denote the ranks on X and Y.
  • Slide 168
  • Spearman's rank correlation coefficient is defined as follows: For each case let dᵢ = rᵢ − sᵢ = the difference in the two ranks. Then Spearman's rank correlation coefficient ρ is defined as: ρ = 1 − 6Σdᵢ²/(n(n² − 1))
  • Slide 169
  • Properties of Spearman's rank correlation coefficient: 1. The value of ρ is always between −1 and +1. 2. If the relationship between X and Y is positive, then ρ will be positive. 3. If the relationship between X and Y is negative, then ρ will be negative. 4. If there is no relationship between X and Y, then ρ will be zero. 5. The value of ρ will be +1 if the ranks of X completely agree with the ranks of Y. 6. The value of ρ will be −1 if the ranks of X are in reverse order to the ranks of Y.
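The rank-based formula is easy to check directly (the ranks below are made up; property 6 is verified with fully reversed ranks):

```python
def spearman_rho(r_ranks, s_ranks):
    """Spearman's rank correlation computed from the two sets of ranks."""
    n = len(r_ranks)
    d_sq = sum((r - s) ** 2 for r, s in zip(r_ranks, s_ranks))
    return 1 - 6 * d_sq / (n * (n ** 2 - 1))

rho = spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```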
  • Slide 170
  • Relationship between Regression and Correlation
  • Slide 171
  • Recall: b = Sxy/Sxx. Also, since r = Sxy/√(Sxx·Syy), we have b = r·(sy/sx). Thus the slope of the least squares line is simply the ratio of the standard deviations times the correlation coefficient.
  • Slide 172
  • The coefficient of Determination
  • Slide 173
  • Sums of Squares associated with Linear Regression: SS_Total = Σ(yᵢ − ȳ)², SS_Reg = SS explained, SS_Error = SS unexplained.
  • Slide 174
  • It can be shown: (Total variability in Y) = (variability in Y explained by X) + (variability in Y unexplained by X)
  • Slide 175
  • It can also be shown: r² = SS_Reg/SS_Total = the proportion of variability in Y explained by X = the coefficient of determination.
  • Slide 176
  • Further: 1 − r² = SS_Error/SS_Total = the proportion of variability in Y that is unexplained by X.
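The decomposition SS_Total = SS_Reg + SS_Error and the coefficient of determination can be verified on a tiny made-up dataset:

```python
# Hypothetical data for a simple linear regression
x = [1, 2, 3, 4]
y = [2, 4, 5, 7]
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
Syy = sum((yi - ybar) ** 2 for yi in y)   # total variability in Y

b = Sxy / Sxx                  # least squares slope
ss_reg = b * Sxy               # variability in Y explained by X
ss_error = Syy - ss_reg        # variability in Y unexplained by X
r_sq = ss_reg / Syy            # coefficient of determination
```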
  • Slide 177
  • Regression (in general)
  • Slide 178
  • In many experiments we would have collected data on a single variable Y (the dependent variable) and on p (say) other variables X1, X2, X3, …, Xp (the independent variables). One is interested in determining a model that describes the relationship between Y (the response (dependent) variable) and X1, X2, …, Xp (the predictor (independent) variables). This model can be used for: Prediction; Controlling Y by manipulating X1, X2, …, Xp.
  • Slide 179
  • The Model is an equation of the form Y = f(X1, X2, …, Xp | θ1, θ2, …, θq) + ε, where θ1, θ2, …, θq are unknown parameters of the function f and ε is a random disturbance (usually assumed to have a normal distribution with mean 0 and standard deviation σ).
  • Slide 180
  • The Multiple Linear Regression Model
  • Slide 181
  • In Multiple Linear Regression we assume the following model: Y = β0 + β1X1 + β2X2 + … + βpXp + ε. This model is called the Multiple Linear Regression Model. Here β0, β1, β2, …, βp are unknown parameters and ε is a random disturbance assumed to have a normal distribution with mean 0 and standard deviation σ.
  • Slide 182
  • Summary of the Statistics used in Multiple Regression
  • Slide 183
  • The Least Squares Estimates: the values of the estimated coefficients that minimize Σ(yᵢ − β0 − β1x1i − … − βpxpi)².
  • Slide 184
  • The Analysis of Variance Table Entries: a) Adjusted Total Sum of Squares (SS_Total) b) Residual Sum of Squares (SS_Error) c) Regression Sum of Squares (SS_Reg). Note: SS_Total = SS_Reg + SS_Error.
  • Slide 185
  • The Analysis of Variance Table:
    Source      Sum of Squares  d.f.     Mean Square                   F
    Regression  SS_Reg          p        SS_Reg/p = MS_Reg             MS_Reg/s²
    Error       SS_Error        n−p−1    SS_Error/(n−p−1) = MS_Error = s²
    Total       SS_Total        n−1
  • Slide 186
  • Uses: 1. To estimate σ² (the error variance): use s² = MS_Error to estimate σ². 2. To test the Hypothesis H0: β1 = β2 = … = βp = 0: use the test statistic F = MS_Reg/MS_Error; reject H0 if F > F_α(p, n−p−1).
  • Slide 187
  • 3. To compute other statistics that are useful in describing the relationship between Y (the dependent variable) and X1, X2, …, Xp (the independent variables). a) R² = the coefficient of determination = SS_Reg/SS_Total = the proportion of variance in Y explained by X1, X2, …, Xp; 1 − R² = the proportion of variance in Y that is left unexplained by X1, X2, …, Xp = SS_Error/SS_Total.
  • Slide 188
  • b) Ra² = "R² adjusted" for degrees of freedom = 1 − [the proportion of variance in Y that is left unexplained by X1, X2, …, Xp, adjusted for d.f.]
  • Slide 189
  • c) R = √R² = the Multiple correlation coefficient of Y with X1, X2, …, Xp = the maximum correlation between Y and a linear combination of X1, X2, …, Xp. Comment: The statistics F, R², Ra², and R are equivalent statistics.
  • Slide 190
  • Logistic regression
  • Slide 191
  • This is the situation in which Logistic Regression is used: the dependent variable y is binary, taking on two values, Success (1) or Failure (0). We are interested in predicting y from a continuous independent variable x.
  • Slide 192
  • The logistic Regression Model: Let p denote P[y = 1] = P[Success]. This quantity will increase with the value of x. The ratio p/(1 − p) is called the odds ratio. This quantity will also increase with the value of x, ranging from zero to infinity. The quantity ln[p/(1 − p)] is called the log odds ratio.
  • Slide 193
  • The logistic Regression Model assumes the log odds ratio is linearly related to x, i.e. ln[p/(1 − p)] = β0 + β1x. In terms of the odds ratio: p/(1 − p) = e^(β0 + β1x).
  • Slide 194
  • The logistic Regression Model: solving for p in terms of x gives p = e^(β0 + β1x)/(1 + e^(β0 + β1x)).
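The solved form of the model is easy to check numerically (the parameter values are made up; p = 0.50 exactly when x = −β0/β1):

```python
import math

def logistic_p(x, b0, b1):
    """P[y = 1] under the logistic model with log odds b0 + b1*x."""
    return math.exp(b0 + b1 * x) / (1 + math.exp(b0 + b1 * x))

b0, b1 = -2.0, 0.5
p_mid = logistic_p(-b0 / b1, b0, b1)   # at x = 4, the log odds are 0
```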
  • Slide 195
  • Interpretation of the parameter β0: determines the intercept of the curve of p against x.
  • Slide 196
  • Interpretation of the parameter β1: determines (along with β0) where p is 0.50; p = 0.50 when x = −β0/β1.
  • Slide 197
  • Interpretation of the parameter β1: determines the slope of the curve of p against x when p is 0.50.
  • Slide 198
  • The Multiple Logistic Regression model
  • Slide 199
  • Here we attempt to predict the outcome of a binary response variable Y from several independent variables X 1, X 2, etc
  • Slide 200
  • Nonparametric Statistical Methods
  • Slide 201
  • Definition: When the data is generated from a process (model) that is known except for a finite number of unknown parameters, the model is called a parametric model. Otherwise, the model is called a non-parametric model. Statistical techniques that assume a non-parametric model are called non-parametric.
  • Slide 202
  • Nonparametric Statistical Methods
  • Slide 203
  • The sign test A nonparametric test for the central location of a distribution
  • Slide 204
  • To carry out the Sign test: 1. Compute the test statistic: S = the number of observations that exceed the hypothesized median; s_observed = the observed value of S. 2. Compute the p-value of the test statistic: p-value = P[S ≥ s_observed] (= 2·P[S ≥ s_observed] for a 2-tailed test), where S is binomial with n = sample size and p = 0.50. 3. Reject H0 if the p-value is low (< 0.05).
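The binomial p-value in step 2 can be computed exactly (the function name and the example counts are illustrative):

```python
from math import comb

def sign_test_p(n, s_obs, two_tailed=True):
    """P[S >= s_obs] for S ~ Binomial(n, 1/2); doubled for a 2-tailed test."""
    upper = sum(comb(n, k) for k in range(s_obs, n + 1)) / 2 ** n
    return min(1.0, 2 * upper) if two_tailed else upper

# 8 of 10 observations above the hypothesized median, one-tailed
p = sign_test_p(10, 8, two_tailed=False)
```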
  • Slide 205
  • Sign Test for Large Samples
  • Slide 206
  • If n is large we can use the Normal approximation to the Binomial. Namely, S has a Binomial distribution with p = 1/2 and n = sample size. Hence for large n, S has approximately a Normal distribution with mean n/2 and standard deviation √n/2.
  • Slide 207
  • Hence for large n, use as the test statistic (in place of S): z = (S − n/2)/(√n/2). Choose the critical region for z from the Standard Normal distribution, i.e. reject H0 if |z| > z_{α/2} (two tailed; a one tailed test can also be set up).
  • Slide 208
  • Nonparametric Confidence Intervals
  • Slide 209
  • Assume that the data x1, x2, x3, …, xn is a sample from an unknown distribution. Now arrange the data in increasing order: x(1) < x(2) < x(3) < … < x(n), where x(1) = the smallest observation, x(2) = the 2nd smallest observation, and x(n) = the largest observation. Consider the kth smallest observation, x(k), and the kth largest observation, x(n − k + 1).
  • Slide 210
  • Hence P[x(k) < median < x(n − k + 1)] = p(k) + p(k + 1) + … + p(n − k) = P = P[k ≤ the no. of obs greater than the median ≤ n − k], where the p(i)'s are binomial probabilities with n = the sample size and p = 1/2. This means that x(k) to x(n − k + 1) is a (100P)% confidence interval for the median. Choose k so that P = p(k) + p(k + 1) + … + p(n − k) is close to 0.95 (or 0.99).
  • Slide 211
  • Summarizing: x(k) to x(n − k + 1) is a (100P)% confidence interval for the median, where P = p(k) + p(k + 1) + … + p(n − k) and the p(i)'s are binomial probabilities with n = the sample size and p = 1/2.
  • Slide 212
  • For large values of n one can use the normal approximation to the Binomial to find the value of k so that x (k) to x (n k + 1) is a 95% confidence interval for the median.
  • Slide 213
  • Slide 214
  • The Wilcoxon Signed Rank Test An Alternative to the sign test
  • Slide 215
  • For Wilcoxon's signed-rank test we assign ranks to the absolute values of (x1 − μ0, x2 − μ0, …, xn − μ0): a rank of 1 to the value of xᵢ − μ0 which is smallest in absolute value, and a rank of n to the value of xᵢ − μ0 which is largest in absolute value. W+ = the sum of the ranks associated with positive values of xᵢ − μ0. W− = the sum of the ranks associated with negative values of xᵢ − μ0.
  • Slide 216
  • To carry out Wilcoxon's signed rank test we: 1. Compute T = W+ or W− (usually the smaller of the two). 2. Let t_observed = the observed value of T. 3. Compute the p-value = P[T ≤ t_observed] (2·P[T ≤ t_observed] for a two-tailed test): i. for n ≤ 12 use the table; ii. for n > 12 use the Normal approximation. 4. Conclude HA (Reject H0) if the p-value is less than 0.05 (or 0.01).
  • Slide 217
  • For sample sizes n > 12 we can use the fact that T (W+ or W−) has approximately a normal distribution with mean n(n + 1)/4 and standard deviation √(n(n + 1)(2n + 1)/24).
  • Slide 218
  • Comments: 1. The t test: i. This test requires the assumption of normality. ii. If the data is not normally distributed the test is invalid: the probability of a type I error may not be equal to its desired value (0.05 or 0.01). iii. If the data is normally distributed, the t-test commits type II errors with a smaller probability than any other test (in particular Wilcoxon's signed rank test or the sign test). 2. The sign test: i. This test does not require the assumption of normality (true also for Wilcoxon's signed rank test). ii. This test ignores the magnitude of the observations completely; Wilcoxon's test takes the magnitude into account by ranking them.
  • Slide 219
  • Two-sample Non-parametric tests
  • Slide 220
  • Mann-Whitney Test A non-parametric two sample test for comparison of central location
  • Slide 221
  • The Mann-Whitney Test: This is a non parametric alternative to the two sample t test (or z test) for independent samples. These tests (t and z) assume the data is normal; the Mann-Whitney test does not make this assumption. Sample of n from population 1: x1, x2, x3, …, xn. Sample of m from population 2: y1, y2, y3, …, ym.
  • Slide 222
  • The Mann-Whitney test statistics U1 and U2: Arrange the observations from the two samples combined in increasing order (retaining sample membership) and assign ranks to the observations. Let W1 = the sum of the ranks for sample 1 and W2 = the sum of the ranks for sample 2. Then U1 = W1 − n(n + 1)/2 and U2 = W2 − m(m + 1)/2.
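A sketch of U1 and U2 from the rank sums, assuming the standard forms Uᵢ = Wᵢ − nᵢ(nᵢ + 1)/2 (the slide's own formulas did not survive extraction) and no tied observations:

```python
def mann_whitney_u(x, y):
    """U statistics from rank sums; this sketch assumes no ties."""
    combined = sorted(x + y)
    rank = {v: i + 1 for i, v in enumerate(combined)}
    W1 = sum(rank[v] for v in x)          # rank sum for sample 1
    W2 = sum(rank[v] for v in y)          # rank sum for sample 2
    n, m = len(x), len(y)
    U1 = W1 - n * (n + 1) // 2
    U2 = W2 - m * (m + 1) // 2
    return U1, U2                         # note U1 + U2 = n*m

U1, U2 = mann_whitney_u([1, 3, 5], [2, 4, 6])
```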
  • Slide 223
  • The distribution function of U (U1 or U2) has been tabled for various values of n and m.
  • The Mann-Whitney test for large samples: for large samples (n > 10 and m > 10) the statistics U1 and U2 have approximately a Normal distribution with mean nm/2 and standard deviation √(nm(n + m + 1)/12).
  • Slide 225
  • Thus we can convert Uᵢ to a standard normal statistic z = (Uᵢ − nm/2)/√(nm(n + m + 1)/12), and reject H0 if |z| > z_{α/2} (for a two tailed test).
  • Slide 226
  • The Kruskal-Wallis Test: comparing the central location for k populations. A nonparametric alternative to the one-way ANOVA F-test.
  • Slide 227
  • Situation: Data is collected from k populations. The sample size from population i is nᵢ. The data from population i is: xᵢ₁, xᵢ₂, …, xᵢₙᵢ.
  • Slide 228
  • The computation of the Kruskal-Wallis statistic: We group the N = n1 + n2 + … + nk observations from the k populations together and rank these observations from 1 to N. Let rᵢⱼ be the rank associated with the observation xᵢⱼ. Handling of tied observations: if a group of observations are equal, the ranks that would have been assigned to those observations are averaged.
  • Slide 229
  • The Kruskal-Wallis statistic: K = [12/(N(N + 1))] Σ Rᵢ²/nᵢ − 3(N + 1), where Rᵢ = the sum of the ranks for the ith sample.
  • Slide 230
  • The Kruskal-Wallis test: Reject H0 (the k populations have the same central location) if K exceeds the critical value of the chi-square distribution with k − 1 degrees of freedom.
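A sketch of the statistic, assuming the usual form K = 12/(N(N + 1)) Σ Rᵢ²/nᵢ − 3(N + 1) and no tied observations (the three tiny samples are made up):

```python
def kruskal_wallis(*samples):
    """Kruskal-Wallis statistic; this sketch assumes no ties."""
    combined = sorted(v for s in samples for v in s)
    rank = {v: i + 1 for i, v in enumerate(combined)}
    N = len(combined)
    # Sum over samples of (rank sum)^2 / sample size
    term = sum(sum(rank[v] for v in s) ** 2 / len(s) for s in samples)
    return 12 / (N * (N + 1)) * term - 3 * (N + 1)

K = kruskal_wallis([1, 2], [3, 4], [5, 6])   # compare with chi-square, df = k - 1
```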
  • Slide 231
  • Probability Theory Probability Models for random phenomena
  • Slide 232
  • Definitions
  • Slide 233
  • The sample Space, S The sample space, S, for a random phenomena is the set of all possible outcomes.
  • Slide 234
  • An Event, E: The event, E, is any subset of the sample space, S, i.e. any set of outcomes (not necessarily all outcomes) of the random phenomena. [Venn diagram: E inside S.]
  • Slide 235
  • The event, E, is said to have occurred if, after the outcome has been observed, the outcome lies in E.
  • Slide 236
  • Set operations on Events: Union. Let A and B be two events; then the union of A and B is the event (denoted by A ∪ B) defined by: A ∪ B = {e | e belongs to A or e belongs to B}.
  • Slide 239
  • Intersection: The event A ∩ B occurs if the event A occurs and the event B occurs.
  • Slide 240
  • Complement: Let A be any event; then the complement of A (denoted by Ā) is defined by: Ā = {e | e does not belong to A}.
  • Slide 241
  • The event Ā occurs if the event A does not occur.
  • Slide 242
  • In problems you will recognize that you are working with: 1.Union if you see the word or, 2.Intersection if you see the word and, 3.Complement if you see the word not.
  • Slide 243
  • Definition: mutually exclusive. Two events A and B are called mutually exclusive if A ∩ B = ∅.
  • Slide 244
  • If two events A and B are mutually exclusive then: 1. They have no outcomes in common; they can't occur at the same time. The outcome of the random experiment cannot belong to both A and B.
  • Slide 245
  • Rules of Probability
  • Slide 246
  • The additive rule: P[A ∪ B] = P[A] + P[B] − P[A ∩ B], and if A ∩ B = ∅ then P[A ∪ B] = P[A] + P[B].
  • Slide 247
  • The Rule for complements: for any event E, P[Ē] = 1 − P[E].
  • Slide 248
  • Conditional probability: P[A | B] = P[A ∩ B]/P[B].
  • Slide 249
  • The multiplicative rule of probability: P[A ∩ B] = P[A]·P[B | A], and P[A ∩ B] = P[A]·P[B] if A and B are independent. This is the definition of independent.
  • Slide 250
  • Counting techniques
  • Slide 251
  • Summary of counting rules. Rule 1: n(A1 ∪ A2 ∪ A3 ∪ …) = n(A1) + n(A2) + n(A3) + … if the sets A1, A2, A3, … are pairwise mutually exclusive (i.e. Aᵢ ∩ Aⱼ = ∅). Rule 2: N = n1·n2 = the number of ways that two operations can be performed in sequence if n1 = the number of ways the first operation can be performed and n2 = the number of ways the second operation can be performed once the first operation has been completed.
  • Slide 252
  • Rule 3: N = n1·n2·…·nk = the number of ways the k operations can be performed in sequence if n1 = the number of ways the first operation can be performed and nᵢ = the number of ways the ith operation can be performed once the first (i − 1) operations have been completed (i = 2, 3, …, k).
  • Slide 253
  • Basic counting formulae: 1. Orderings: n! = the number of ways to order n objects. 2. Permutations: nPk = n!/(n − k)! = the number of ways that you can choose k objects from n in a specific order. 3. Combinations: nCk = n!/(k!(n − k)!) = the number of ways that you can choose k objects from n (order of selection irrelevant).
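Python's math module implements all three formulas directly:

```python
import math

orderings = math.factorial(5)   # 5! ways to order 5 objects
perms = math.perm(5, 2)         # ordered choices of 2 from 5: 5!/(5-2)!
combs = math.comb(5, 2)         # unordered choices of 2 from 5: 5!/(2! 3!)
```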
  • Slide 254
  • Random Variables: numerical quantities whose values are determined by the outcome of a random experiment.
  • Slide 255
  • Random variables are either Discrete Integer valued The set of possible values for X are integers Continuous The set of possible values for X are all real numbers Range over a continuum.
  • Slide 256
  • The Probability distribution of A random variable A Mathematical description of the possible values of the random variable together with the probabilities of those values
  • Slide 257
  • The probability distribution of a discrete random variable is described by its probability function p(x): p(x) = the probability that X takes on the value x. This can be given in either a tabular form or in the form of an equation. It can also be displayed in a graph.
  • Slide 258
  • Comments: Every probability function must satisfy: 1. The probability assigned to each value of the random variable must be between 0 and 1, inclusive: 0 ≤ p(x) ≤ 1. 2. The sum of the probabilities assigned to all the values of the random variable must equal 1: Σ p(x) = 1.
  • Slide 259
  • Probability Distributions of Continuous Random Variables
  • Slide 260
  • Probability Density Function: the probability distribution of a continuous random variable is described by a probability density curve f(x).
  • Slide 261
  • Notes: The total area under the probability density curve is 1. The area under the probability density curve from a to b is P[a < X < b].
  • Slide 262
  • Normal Probability Distributions (Bell shaped curve)
  • Slide 263
  • Mean, Variance and standard deviation of Random Variables Numerical descriptors of the distribution of a Random Variable
  • Slide 264
  • Mean of a Discrete Random Variable: The mean, μ, of a discrete random variable x is found by multiplying each possible value of x by its own probability and then adding all the products together: μ = Σ x·p(x). Notes: The mean is a weighted average of the values of X. The mean is the long-run average value of the random variable. The mean is the centre of gravity of the probability distribution of the random variable.
  • Slide 265
  • Variance of a Discrete Random Variable: The variance, σ², of a discrete random variable x is found by multiplying each possible value of the squared deviation from the mean, (x − μ)², by its own probability and then adding all the products together: σ² = Σ (x − μ)²·p(x). Standard Deviation of a Discrete Random Variable: the positive square root of the variance: σ = √σ².
  • Slide 266
  • The Binomial distribution An important discrete distribution
  • Slide 267
  • X is said to have the Binomial distribution with parameters n and p. 1. X is the number of successes occurring in the n repetitions of a Success-Failure Experiment. 2. The probability of success is p. 3. The probability function: p(k) = P[X = k] = (n choose k)·pᵏ·(1 − p)ⁿ⁻ᵏ, for k = 0, 1, …, n.
  • Slide 268
  • Mean, Variance & Standard Deviation of the Binomial Distribution: μ = np, σ² = np(1 − p), σ = √(np(1 − p)).
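A quick check of the three formulas and the probability function (n and p are made up; the probabilities must sum to 1):

```python
from math import comb, sqrt

n, p = 10, 0.3
mean = n * p                        # mu = np
var = n * p * (1 - p)               # sigma^2 = np(1 - p)
sd = sqrt(var)

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

p3 = binom_pmf(3, n, p)
total = sum(binom_pmf(k, n, p) for k in range(n + 1))   # should equal 1
```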
  • Slide 269
  • Mean of a Continuous Random Variable (uses calculus): The mean, μ, of a continuous random variable x is μ = ∫ x·f(x) dx. Notes: The mean is a weighted average of the values of X. The mean is the long-run average value of the random variable. The mean is the centre of gravity of the probability distribution of the random variable.
  • Slide 270
  • Variance of a Continuous Random Variable: σ² = ∫ (x − μ)²·f(x) dx. Standard Deviation of a Continuous Random Variable: the positive square root of the variance: σ = √σ².
  • Slide 271
  • The Normal Probability Distribution: points of inflection at μ − σ and μ + σ.
  • Slide 272
  • Main characteristics of the Normal Distribution: Bell shaped, symmetric. Points of inflection on the bell shaped curve are at μ − σ and μ + σ, that is, one standard deviation from the mean. The area under the bell shaped curve between μ − σ and μ + σ is approximately 2/3. The area under the bell shaped curve between μ − 2σ and μ + 2σ is approximately 95%.
  • Slide 273
  • Normal approximation to the Binomial distribution Using the Normal distribution to calculate Binomial probabilities
  • Slide 274
  • Normal Approximation to the Binomial distribution: if X has a Binomial distribution with parameters n and p, then for large n, X is approximated by Y, where Y has a Normal distribution with mean np and standard deviation √(np(1 − p)).
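A sketch comparing the approximation against the exact Binomial probability; a continuity correction (evaluating the Normal cdf at 55.5 rather than 55) is a common refinement and is assumed here:

```python
from math import comb
from statistics import NormalDist

n, p = 100, 0.5
mu, sigma = n * p, (n * p * (1 - p)) ** 0.5      # 50 and 5

# P[X <= 55] by the Normal approximation with continuity correction
approx = NormalDist(mu, sigma).cdf(55.5)

# Exact Binomial probability for comparison
exact = sum(comb(n, k) for k in range(56)) / 2 ** n
```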
  • Slide 275
  • Sampling Theory Determining the distribution of Sample statistics
  • Slide 276
  • The distribution of the sample mean
  • Slide 277
  • Thus if x1, x2, …, xn denote n independent random variables each coming from the same Normal distribution with mean μ and standard deviation σ, then x̄ has a Normal distribution with mean μ and standard deviation σ/√n.
  • Slide 278
  • The Central Limit Theorem: The Central Limit Theorem (C.L.T.) states that if n is sufficiently large, the sample means of random samples from any population with mean μ and finite standard deviation σ are approximately normally distributed with mean μ and standard deviation σ/√n. Technical Note: The mean and standard deviation given in the CLT hold for any sample size; it is only the approximately normal shape that requires n to be sufficiently large.
  • Slide 279
  • Graphical Illustration of the Central Limit Theorem: [Figure: the original (non-normal) population and the distributions of x̄ for n = 2, 10, and 30, becoming more nearly normal and less variable as n increases.]
  • Slide 280
  • Implications of the Central Limit Theorem: The conclusion that the sampling distribution of the sample mean is Normal will be true if the sample size is large (> 30), even though the population may be non-normal. When the population can be assumed to be normal, the sampling distribution of the sample mean is Normal for any sample size. Knowing the sampling distribution of the sample mean allows us to answer probability questions related to the sample mean.
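The CLT can be illustrated by simulation from a non-normal population; here Uniform(0, 1), which has μ = 0.5 and σ = 1/√12, so the sample means should concentrate around 0.5 with spread σ/√n:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 30, 20_000

# Many sample means of size-30 samples from a Uniform(0, 1) population
means = rng.uniform(0, 1, size=(reps, n)).mean(axis=1)

mu = 0.5
sigma = (1 / 12) ** 0.5
se = sigma / n ** 0.5           # the CLT standard deviation of x-bar
```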
  • Slide 281
  • Sampling Distribution of a Sample Proportion
  • Slide 282
  • Sampling Distribution for Sample Proportions: Let p = population proportion of interest or binomial probability of success, and let p̂ = sample proportion or proportion of successes. Then p̂ has approximately a normal distribution with mean p and standard deviation √(p(1 − p)/n).
  • Slide 283
  • Sampling distribution of a differences
  • Slide 284
  • If X, Y are independent normal random variables, then X − Y is normal with mean μX − μY and standard deviation √(σX² + σY²). Note: the variances add even though the variables are subtracted.
  • Slide 285
  • Sampling distribution of a difference in two Sample means
  • Slide 286
  • Situation: We have two normal populations (1 and 2). Let μ1 and σ1 denote the mean and standard deviation of population 1. Let μ2 and σ2 denote the mean and standard deviation of population 2. Let x1, x2, x3, …, xn denote a sample from normal population 1. Let y1, y2, y3, …, ym denote a sample from normal population 2. The objective is to compare the two population means.
  • Slide 287
  • Then x̄ − ȳ has a Normal distribution with mean μ1 − μ2 and standard deviation √(σ1²/n + σ2²/m).
  • Slide 288
  • Sampling distribution of a difference in two Sample proportions
  • Slide 289
  • Situation: Suppose we have two Success-Failure experiments. Let p1 = the probability of success for experiment 1. Let p2 = the probability of success for experiment 2. Suppose that experiment 1 is repeated n1 times and experiment 2 is repeated n2 times. Let x1 = the no. of successes in the n1 repetitions of experiment 1, and x2 = the no. of successes in the n2 repetitions of experiment 2.
  • Slide 290
  • Then p̂1 − p̂2 = x1/n1 − x2/n2 has approximately a Normal distribution with mean p1 − p2 and standard deviation √(p1(1 − p1)/n1 + p2(1 − p2)/n2).
  • Slide 291
  • The Chi-square (χ²) distribution
  • Slide 292
  • The Chi-squared distribution with ν degrees of freedom. Comment: If z1, z2, …, zν are independent random variables each having a standard normal distribution, then U = z1² + z2² + … + zν² has a chi-squared distribution with ν degrees of freedom.
  • Slide 293
  • The Chi-squared distribution with ν degrees of freedom (ν = degrees of freedom)
  • Slide 294
  • 2 d.f. 3 d.f. 4 d.f.
  • Slide 295
  • Statistics that have the Chi-squared distribution: χ² = Σ (observed − expected)²/expected. This statistic is used to detect independence between two categorical variables; d.f. = (r − 1)(c − 1).
  • Slide 296
  • Let x1, x2, …, xn denote a sample from the normal distribution with mean μ and standard deviation σ; then (n − 1)s²/σ² has a chi-square distribution with d.f. = n − 1.
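A simulation check that (n − 1)s²/σ² behaves like a chi-square variable with n − 1 degrees of freedom, whose mean is n − 1 (the value of σ is chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(42)
n, sigma = 10, 2.0

# Many samples of size n from N(0, sigma); for each, form (n-1)s^2/sigma^2
samples = rng.normal(0, sigma, size=(20_000, n))
stats = (n - 1) * samples.var(axis=1, ddof=1) / sigma ** 2

# The average of these statistics should be close to n - 1 = 9
```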