
Discrete Distributions - Kennesaw State University



Discrete Distributions

Bernoulli
  f(x) = p^x (1 − p)^(1−x),  x = 0, 1;  0 < p < 1
  M(t) = 1 − p + pe^t,  −∞ < t < ∞
  μ = p,  σ² = p(1 − p)

Binomial b(n, p)
  f(x) = [n!/(x!(n − x)!)] p^x (1 − p)^(n−x),  x = 0, 1, 2, ..., n;  0 < p < 1
  M(t) = (1 − p + pe^t)^n,  −∞ < t < ∞
  μ = np,  σ² = np(1 − p)

Geometric
  f(x) = (1 − p)^(x−1) p,  x = 1, 2, 3, ...;  0 < p < 1
  M(t) = pe^t/[1 − (1 − p)e^t],  t < −ln(1 − p)
  μ = 1/p,  σ² = (1 − p)/p²

Hypergeometric
  f(x) = C(N1, x) C(N2, n − x)/C(N, n),  x ≤ n, x ≤ N1, n − x ≤ N2;  N1 > 0, N2 > 0, N = N1 + N2
  μ = n(N1/N),  σ² = n(N1/N)(N2/N)(N − n)/(N − 1)

Negative Binomial
  f(x) = C(x − 1, r − 1) p^r (1 − p)^(x−r),  x = r, r + 1, r + 2, ...;  0 < p < 1, r = 1, 2, 3, ...
  M(t) = (pe^t)^r/[1 − (1 − p)e^t]^r,  t < −ln(1 − p)
  μ = r/p,  σ² = r(1 − p)/p²

Poisson
  f(x) = λ^x e^(−λ)/x!,  x = 0, 1, 2, ...;  λ > 0
  M(t) = e^(λ(e^t − 1)),  −∞ < t < ∞
  μ = λ,  σ² = λ

Uniform (discrete)
  f(x) = 1/m,  x = 1, 2, ..., m;  m > 0
  μ = (m + 1)/2,  σ² = (m² − 1)/12

Here C(n, k) denotes the binomial coefficient n!/(k!(n − k)!).
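For a quick numerical check of one row of this table, the following Python sketch (an illustration, with n and p chosen arbitrarily) evaluates the binomial pmf directly and confirms that the resulting mean and variance agree with np and np(1 − p):

```python
from math import comb

n, p = 10, 0.3   # arbitrary illustrative values

# Binomial pmf f(x) = C(n, x) p^x (1 - p)^(n - x), x = 0, 1, ..., n.
f = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

mu = sum(x * fx for x, fx in f.items())
var = sum((x - mu)**2 * fx for x, fx in f.items())

print(mu, n * p)              # both equal np = 3.0 (up to rounding)
print(var, n * p * (1 - p))   # both equal np(1 - p) = 2.1 (up to rounding)
```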


Continuous Distributions

Beta
  f(x) = [Γ(α + β)/(Γ(α)Γ(β))] x^(α−1) (1 − x)^(β−1),  0 < x < 1;  α > 0, β > 0
  μ = α/(α + β),  σ² = αβ/[(α + β + 1)(α + β)²]

Chi-square χ²(r)
  f(x) = [1/(Γ(r/2) 2^(r/2))] x^(r/2−1) e^(−x/2),  0 < x < ∞;  r = 1, 2, ...
  M(t) = 1/(1 − 2t)^(r/2),  t < 1/2
  μ = r,  σ² = 2r

Exponential
  f(x) = (1/θ) e^(−x/θ),  0 ≤ x < ∞;  θ > 0
  M(t) = 1/(1 − θt),  t < 1/θ
  μ = θ,  σ² = θ²

Gamma
  f(x) = [1/(Γ(α)θ^α)] x^(α−1) e^(−x/θ),  0 < x < ∞;  α > 0, θ > 0
  M(t) = 1/(1 − θt)^α,  t < 1/θ
  μ = αθ,  σ² = αθ²

Normal N(μ, σ²)
  f(x) = [1/(σ√(2π))] e^(−(x−μ)²/(2σ²)),  −∞ < x < ∞;  −∞ < μ < ∞, σ > 0
  M(t) = e^(μt + σ²t²/2),  −∞ < t < ∞
  E(X) = μ,  Var(X) = σ²

Uniform U(a, b)
  f(x) = 1/(b − a),  a ≤ x ≤ b;  −∞ < a < b < ∞
  M(t) = (e^(tb) − e^(ta))/[t(b − a)],  t ≠ 0;  M(0) = 1
  μ = (a + b)/2,  σ² = (b − a)²/12
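The continuous entries can be spot-checked in the same way; this sketch (again only an illustration, with θ chosen arbitrarily and SciPy used for the quadrature) integrates the exponential pdf to recover μ = θ and σ² = θ²:

```python
from math import exp
from scipy.integrate import quad

theta = 2.5   # arbitrary illustrative value

pdf = lambda x: (1 / theta) * exp(-x / theta)          # exponential density
mu = quad(lambda x: x * pdf(x), 0, float("inf"))[0]
ex2 = quad(lambda x: x**2 * pdf(x), 0, float("inf"))[0]

print(mu, theta)               # ≈ 2.5 and 2.5
print(ex2 - mu**2, theta**2)   # ≈ 6.25 and 6.25
```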


Chapter 4
Bivariate Distributions

4.1 Bivariate Distributions of the Discrete Type
4.2 The Correlation Coefficient
4.3 Conditional Distributions
4.4 Bivariate Distributions of the Continuous Type
4.5 The Bivariate Normal Distribution

4.1 BIVARIATE DISTRIBUTIONS OF THE DISCRETE TYPE

So far, we have taken only one measurement on a single item under observation. However, it is clear in many practical cases that it is possible, and often very desirable, to take more than one measurement of a random observation. Suppose, for example, that we are observing female college students to obtain information about some of their physical characteristics, such as height, x, and weight, y, because we are trying to determine a relationship between those two characteristics. For instance, there may be some pattern between height and weight that can be described by an appropriate curve y = u(x). Certainly, not all of the points observed will be on this curve, but we want to attempt to find the "best" curve to describe the relationship and then say something about the variation of the points around the curve.

Another example might concern high school rank, say x, and the ACT (or SAT) score, say y, of incoming college students. What is the relationship between these two characteristics? More importantly, how can we use those measurements to predict a third one, such as first-year college GPA, say z, with a function z = v(x, y)? This is a very important problem for college admission offices, particularly when it comes to awarding an athletic scholarship, because the incoming student-athlete must satisfy certain conditions before receiving such an award.

Definition 4.1-1
Let X and Y be two random variables defined on a discrete space. Let S denote the corresponding two-dimensional space of X and Y, the two random variables of the discrete type. The probability that X = x and Y = y is denoted by f(x, y) = P(X = x, Y = y). The function f(x, y) is called the joint probability mass function (joint pmf) of X and Y and has the following properties:


(a) 0 ≤ f(x, y) ≤ 1.

(b) Σ Σ_{(x,y)∈S} f(x, y) = 1.

(c) P[(X, Y) ∈ A] = Σ Σ_{(x,y)∈A} f(x, y), where A is a subset of the space S.

The following example will make this definition more meaningful.

Example 4.1-1

Roll a pair of fair dice. For each of the 36 sample points with probability 1/36, let X denote the smaller and Y the larger outcome on the dice. For example, if the outcome is (3, 2), then the observed values are X = 2, Y = 3. The event {X = 2, Y = 3} could occur in one of two ways, (3, 2) or (2, 3), so its probability is

1/36 + 1/36 = 2/36.

If the outcome is (2, 2), then the observed values are X = 2, Y = 2. Since the event {X = 2, Y = 2} can occur in only one way, P(X = 2, Y = 2) = 1/36. The joint pmf of X and Y is given by the probabilities

f(x, y) = 1/36,  1 ≤ x = y ≤ 6,
f(x, y) = 2/36,  1 ≤ x < y ≤ 6,

when x and y are integers. Figure 4.1-1 depicts the probabilities of the various points of the space S.

[Figure 4.1-1, Discrete joint pmf: each diagonal point (x = y) of S has probability 1/36 and each off-diagonal point (x < y) has probability 2/36; the marginal totals 1/36, 3/36, 5/36, 7/36, 9/36, 11/36 appear in the margins.]


Notice that certain numbers have been recorded in the bottom and left-hand margins of Figure 4.1-1. These numbers are the respective column and row totals of the probabilities. The column totals are the respective probabilities that X will assume the values in the x space SX = {1, 2, 3, 4, 5, 6}, and the row totals are the respective probabilities that Y will assume the values in the y space SY = {1, 2, 3, 4, 5, 6}. That is, the totals describe the probability mass functions of X and Y, respectively. Since each collection of these probabilities is frequently recorded in the margins and satisfies the properties of a pmf of one random variable, each is called a marginal pmf.
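As an aside, the joint pmf of Example 4.1-1 and its marginal totals are easy to tabulate by machine; the following Python sketch (not part of the text) reproduces the margins shown in Figure 4.1-1:

```python
from fractions import Fraction

# Joint pmf of X = min and Y = max of two fair dice (Example 4.1-1).
f = {}
for d1 in range(1, 7):
    for d2 in range(1, 7):
        x, y = min(d1, d2), max(d1, d2)
        f[(x, y)] = f.get((x, y), Fraction(0)) + Fraction(1, 36)

# Marginal pmfs obtained by summing over the other variable.
fX = {x: sum(p for (a, b), p in f.items() if a == x) for x in range(1, 7)}
fY = {y: sum(p for (a, b), p in f.items() if b == y) for y in range(1, 7)}

print(fX)               # marginal of X: 11/36, 9/36, 7/36, 5/36, 3/36, 1/36
print(fY)               # marginal of Y: 1/36, 3/36, 5/36, 7/36, 9/36, 11/36
print(sum(f.values()))  # total probability 1
```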

Definition 4.1-2
Let X and Y have the joint probability mass function f(x, y) with space S. The probability mass function of X alone, which is called the marginal probability mass function of X, is defined by

fX(x) = Σ_y f(x, y) = P(X = x),  x ∈ SX,

where the summation is taken over all possible y values for each given x in the x space SX. That is, the summation is over all (x, y) in S with a given x value. Similarly, the marginal probability mass function of Y is defined by

fY(y) = Σ_x f(x, y) = P(Y = y),  y ∈ SY,

where the summation is taken over all possible x values for each given y in the y space SY. The random variables X and Y are independent if and only if, for every x ∈ SX and every y ∈ SY,

P(X = x, Y = y) = P(X = x)P(Y = y)

or, equivalently,

f(x, y) = fX(x)fY(y);

otherwise, X and Y are said to be dependent.

We note in Example 4.1-1 that X and Y are dependent because there are many x and y values for which f(x, y) ≠ fX(x)fY(y). For instance,

fX(1)fY(1) = (11/36)(1/36) ≠ 1/36 = f(1, 1).

Example 4.1-2
Let the joint pmf of X and Y be defined by

f(x, y) = (x + y)/21,  x = 1, 2, 3,  y = 1, 2.

Then

fX(x) = Σ_y f(x, y) = Σ_{y=1}^{2} (x + y)/21 = (x + 1)/21 + (x + 2)/21 = (2x + 3)/21,  x = 1, 2, 3.
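A short check of this marginal, and of the corresponding marginal of Y obtained by summing over x, might look like the following (the use of exact fractions is an illustrative choice, not part of the text):

```python
from fractions import Fraction

# Joint pmf f(x, y) = (x + y)/21 from Example 4.1-2.
f = {(x, y): Fraction(x + y, 21) for x in (1, 2, 3) for y in (1, 2)}

fX = {x: sum(f[(x, y)] for y in (1, 2)) for x in (1, 2, 3)}   # (2x + 3)/21
fY = {y: sum(f[(x, y)] for x in (1, 2, 3)) for y in (1, 2)}   # (6 + 3y)/21

print(fX)                            # values 5/21, 7/21, 9/21
print(fY)                            # values 9/21, 12/21
print(f[(1, 1)] == fX[1] * fY[1])    # False, so X and Y are dependent
```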


[Figure 3.3, Expectation as a center of gravity: panel (a) shows a distribution with E(X) = 0.5; panel (b) shows a distribution with E(X) = 0.25.]

Similar arguments can be used to derive the general formula for the expectation.

Expectation, discrete case:

μ = E(X) = Σ_x x P(x)    (3.3)

This formula returns the center of gravity for a system with masses P(x) allocated at points x. Expected value is often denoted by the Greek letter μ.

In a certain sense, expectation is the best forecast of X. The variable itself is random. It takes different values with different probabilities P(x). At the same time, it has just one expectation E(X), which is non-random.

3.3.2 Expectation of a function

Often we are interested in another variable, Y, that is a function of X. For example, downloading time depends on the connection speed, the profit of a computer store depends on the number of computers sold, and the bonus of its manager depends on this profit. The expectation of Y = g(X) is computed by a similar formula,

E{g(X)} = Σ_x g(x) P(x).    (3.4)

Remark: Indeed, if g is a one-to-one function, then Y takes each value y = g(x) with probability P(x), and the formula for E(Y) can be applied directly. If g is not one-to-one, then some values of g(x) will be repeated in (3.4). However, they are still multiplied by the corresponding probabilities. When we add in (3.4), these probabilities are also added; thus each value of g(x) is still multiplied by the probability PY(g(x)).
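To see the remark in action, here is a small sketch (with a made-up pmf, not from the text) that evaluates (3.4) directly and again after grouping equal values of g(x) = x²; the two computations agree even though g is not one-to-one:

```python
# Hypothetical pmf of X on {-1, 0, 1, 2}; g(x) = x**2 is not one-to-one.
P = {-1: 0.2, 0: 0.3, 1: 0.4, 2: 0.1}
g = lambda x: x ** 2

# Formula (3.4): sum of g(x) P(x) over x.
Eg_direct = sum(g(x) * p for x, p in P.items())

# Equivalent: group probabilities by the value of g(x) first.
PY = {}
for x, p in P.items():
    PY[g(x)] = PY.get(g(x), 0.0) + p
Eg_grouped = sum(y * p for y, p in PY.items())

print(Eg_direct, Eg_grouped)   # both equal 1.0 (up to floating-point rounding)
```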

3.3.3 Properties

The following linear properties of expectations follow directly from (3.3) and (3.4). For any random variables X and Y and any non-random numbers a, b, and c, we have


Properties of expectations:

E(aX + bY + c) = aE(X) + bE(Y) + c

In particular,

E(X + Y) = E(X) + E(Y)
E(aX) = aE(X)
E(c) = c

For independent X and Y,

E(XY) = E(X)E(Y)    (3.5)

Proof: The first property follows from the Addition Rule (3.2). For any X and Y,

E(aX + bY + c) = Σ_x Σ_y (ax + by + c) P(X,Y)(x, y)
             = Σ_x ax Σ_y P(X,Y)(x, y) + Σ_y by Σ_x P(X,Y)(x, y) + c Σ_x Σ_y P(X,Y)(x, y)
             = a Σ_x x PX(x) + b Σ_y y PY(y) + c.

The next three equalities are special cases. To prove the last property, we recall that P(X,Y)(x, y) = PX(x)PY(y) for independent X and Y, and therefore,

E(XY) = Σ_x Σ_y (xy) PX(x) PY(y) = Σ_x x PX(x) Σ_y y PY(y) = E(X)E(Y).

Remark: The last property in (3.5) holds for some dependent variables too; hence it cannot be used to verify independence of X and Y.

Example 3.9. In Example 3.6 on p. 46,

E(X) = (0)(0.5) + (1)(0.5) = 0.5 and
E(Y) = (0)(0.4) + (1)(0.3) + (2)(0.15) + (3)(0.15) = 1.05;

therefore, the expected total number of errors is

E(X + Y) = 0.5 + 1.05 = 1.65.

Remark: Clearly, the program will never have 1.65 errors, because the number of errors is always an integer. Then, should we round 1.65 to 2 errors? Absolutely not; it would be a mistake. Although both X and Y are integers, their expectations, or average values, do not have to be integers at all.
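A two-line verification of this additivity, using the marginal pmfs stated in the example, could look like:

```python
# Marginal pmfs of X and Y from Example 3.9 (numbers of errors).
PX = {0: 0.5, 1: 0.5}
PY = {0: 0.4, 1: 0.3, 2: 0.15, 3: 0.15}

EX = sum(x * p for x, p in PX.items())   # 0.5
EY = sum(y * p for y, p in PY.items())   # 1.05
print(EX + EY)                           # 1.65, which is E(X + Y) by linearity
```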

3.3.4 Variance and standard deviation

Expectation shows where the average value of a random variable is located, or where the variable is expected to be, plus or minus some error. How large could this "error" be, and how much can a variable vary around its expectation? Let us introduce some measures of variability.


Example 3.10. Here is a rather artificial but illustrative scenario. Consider two users. One receives either 48 or 52 e-mail messages per day, with a 50-50% chance of each. The other receives either 0 or 100 e-mails, also with a 50-50% chance. What is a common feature of these two distributions, and how are they different?

We see that both users receive the same average number of e-mails:

E(X) = E(Y) = 50.

However, in the first case, the actual number of e-mails is always close to 50, whereas it always differs from it by 50 in the second case. The first random variable, X, is more stable; it has low variability. The second variable, Y, has high variability. ♦

DEFINITION 3.6
Variance of a random variable is defined as the expected squared deviation from the mean. For discrete random variables, variance is

σ² = Var(X) = E(X − EX)² = Σ_x (x − μ)² P(x)

Remark: Notice that if the distance to the mean is not squared, then the result is always μ − μ = 0, bearing no information about the distribution of X.

According to this definition, variance is always non-negative. Further, it equals 0 only if x = μ for all values of x, i.e., when X is constantly equal to μ. Certainly, a constant (non-random) variable has zero variability.

Variance can also be computed as

Var(X) = E(X²) − μ².    (3.6)

A proof of this is left as Exercise 3.38a.

DEFINITION 3.7
Standard deviation is the square root of variance,

σ = Std(X) = √Var(X)

Continuing the Greek-letter tradition, variance is often denoted by σ², and then standard deviation is σ.

If X is measured in some units, then its mean μ has the same measurement unit as X. Variance σ² is measured in squared units, and therefore, it cannot be compared with X or μ. No matter how funny it sounds, it is rather normal to measure variance of profit in squared dollars, variance of class enrollment in squared students, and variance of available disk space in squared gigabytes. When a square root is taken, the resulting standard deviation σ is again measured in the same units as X. This is the main reason for introducing yet another measure of variability, σ.
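For the two e-mail users of Example 3.10, these definitions give concrete numbers; the sketch below (an illustration, not part of the text) computes them:

```python
from math import sqrt

def mean_var_std(pmf):
    """Return (mean, variance, standard deviation) of a discrete pmf."""
    mu = sum(x * p for x, p in pmf.items())
    var = sum((x - mu) ** 2 * p for x, p in pmf.items())
    return mu, var, sqrt(var)

X = {48: 0.5, 52: 0.5}    # first user of Example 3.10
Y = {0: 0.5, 100: 0.5}    # second user

print(mean_var_std(X))    # (50.0, 4.0, 2.0): low variability
print(mean_var_std(Y))    # (50.0, 2500.0, 50.0): high variability
```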


3.3.5 Covariance and correlation

Expectation, variance, and standard deviation characterize the distribution of a single random variable. Now we introduce measures of association of two random variables.

[Figure 3.4, Positive, negative, and zero covariance: scatter plots of (X, Y) for (a) Cov(X, Y) > 0, (b) Cov(X, Y) < 0, (c) Cov(X, Y) = 0.]

DEFINITION 3.8
Covariance σXY = Cov(X, Y) is defined as

Cov(X, Y) = E{(X − EX)(Y − EY)} = E(XY) − E(X)E(Y)

It summarizes the interrelation of two random variables.

Covariance is the expected product of deviations of X and Y from their respective expectations. If Cov(X, Y) > 0, then positive deviations (X − EX) are more likely to be multiplied by positive (Y − EY), and negative (X − EX) are more likely to be multiplied by negative (Y − EY). In short, large X imply large Y, and small X imply small Y. These variables are positively correlated, Figure 3.4a.

Conversely, Cov(X, Y) < 0 means that large X generally correspond to small Y and small X correspond to large Y. These variables are negatively correlated, Figure 3.4b.

If Cov(X, Y) = 0, we say that X and Y are uncorrelated, Figure 3.4c.

DEFINITION 3.9
Correlation coefficient between variables X and Y is defined as

ρ = Cov(X, Y) / [(Std X)(Std Y)]

Correlation coefficient is a rescaled, normalized covariance. Notice that covariance Cov(X, Y) has a measurement unit. It is measured in units of X multiplied by units of Y. As a result, it is not clear from its value whether X and Y are strongly or weakly correlated. Really, one has to compare Cov(X, Y) with the magnitudes of X and Y. The correlation coefficient performs such a comparison, and as a result, it is dimensionless.
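As a small numerical illustration of the two definitions (the joint pmf here is hypothetical, chosen only for the example), covariance and correlation can be computed directly from a joint pmf:

```python
from math import sqrt

# Hypothetical joint pmf of (X, Y) on four points.
f = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.5}

EX  = sum(x * p for (x, y), p in f.items())
EY  = sum(y * p for (x, y), p in f.items())
EXY = sum(x * y * p for (x, y), p in f.items())
VarX = sum((x - EX) ** 2 * p for (x, y), p in f.items())
VarY = sum((y - EY) ** 2 * p for (x, y), p in f.items())

cov = EXY - EX * EY                      # Definition 3.8
rho = cov / (sqrt(VarX) * sqrt(VarY))    # Definition 3.9
print(cov, rho)                          # ≈ 0.14 and ≈ 0.58, a positive association
```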


[Figure 3.5, Perfect correlation ρ = ±1: in each panel all (X, Y) values lie on a straight line, with ρ = 1 on the left and ρ = −1 on the right.]

How do we interpret the value of ρ? What possible values can it take?

As a special case of the famous Cauchy-Schwarz inequality,

−1 ≤ ρ ≤ 1,

where |ρ| = 1 is possible only when all values of X and Y lie on a straight line, as in Figure 3.5. Further, values of ρ near 1 indicate strong positive correlation, values near (−1) show strong negative correlation, and values near 0 show weak correlation or no correlation.

3.3.6 Properties

The following properties of variances, covariances, and correlation coefficients hold for any random variables X, Y, Z, and W and any non-random numbers a, b, c, and d.

Properties of variances and covariances:

Var(aX + bY + c) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y)

Cov(aX + bY, cZ + dW) = ac Cov(X, Z) + ad Cov(X, W) + bc Cov(Y, Z) + bd Cov(Y, W)

Cov(X, Y) = Cov(Y, X)

ρ(X, Y) = ρ(Y, X)

In particular,

Var(aX + b) = a² Var(X)
Cov(aX + b, cY + d) = ac Cov(X, Y)
ρ(aX + b, cY + d) = ρ(X, Y)

For independent X and Y,

Cov(X, Y) = 0
Var(X + Y) = Var(X) + Var(Y)    (3.7)
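A quick sanity check of the first property, again with a hypothetical joint pmf and arbitrary constants a, b, c (an illustration only):

```python
# Hypothetical joint pmf; a, b, c are arbitrary non-random constants.
f = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.5}
a, b, c = 2.0, -3.0, 5.0

def E(h):
    """Expectation of h(X, Y) under the joint pmf f."""
    return sum(h(x, y) * p for (x, y), p in f.items())

VarX = E(lambda x, y: x * x) - E(lambda x, y: x) ** 2
VarY = E(lambda x, y: y * y) - E(lambda x, y: y) ** 2
Cov  = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)

lhs = E(lambda x, y: (a * x + b * y + c) ** 2) - E(lambda x, y: a * x + b * y + c) ** 2
rhs = a ** 2 * VarX + b ** 2 * VarY + 2 * a * b * Cov
print(abs(lhs - rhs) < 1e-12)   # True: Var(aX + bY + c) matches the formula
```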


5.6 THE CENTRAL LIMIT THEOREM

In Section 5.4, we found that the mean X̄ of a random sample of size n from a distribution with mean μ and variance σ² > 0 is a random variable with the properties that

E(X̄) = μ and Var(X̄) = σ²/n.

As n increases, the variance of X̄ decreases. Consequently, the distribution of X̄ clearly depends on n, and we see that we are dealing with sequences of distributions.

In Theorem 5.5-1, we considered the pdf of X̄ when sampling is from the normal distribution N(μ, σ²). We showed that the distribution of X̄ is N(μ, σ²/n), and in Figure 5.5-1, by graphing the pdfs for several values of n, we illustrated the property that as n increases, the probability becomes concentrated in a small interval centered at μ. That is, as n increases, X̄ tends to converge to μ, or (X̄ − μ) tends to converge to 0 in a probability sense. (See Section 5.8.)

In general, if we let

W = √n (X̄ − μ)/σ = (X̄ − μ)/(σ/√n) = (Y − nμ)/(√n σ),

where Y is the sum of a random sample of size n from some distribution with mean μ and variance σ², then, for each positive integer n,

E(W) = E[(X̄ − μ)/(σ/√n)] = [E(X̄) − μ]/(σ/√n) = (μ − μ)/(σ/√n) = 0

and

Var(W) = E(W²) = E[(X̄ − μ)²/(σ²/n)] = E[(X̄ − μ)²]/(σ²/n) = (σ²/n)/(σ²/n) = 1.

Thus, while X̄ − μ tends to "degenerate" to zero, the factor √n/σ in √n(X̄ − μ)/σ "spreads out" the probability enough to prevent this degeneration. What, then, is the distribution of W as n increases? One observation that might shed some light on the answer to this question can be made immediately. If the sample arises from a normal distribution, then, from Theorem 5.5-1, we know that X̄ is N(μ, σ²/n), and hence W is N(0, 1) for each positive n. Thus, in the limit, the distribution of W must be N(0, 1). So if the solution of the question does not depend on the underlying distribution (i.e., it is unique), the answer must be N(0, 1). As we will see, that is exactly the case, and this result is so important that it is called the central limit theorem, the proof of which is given in Section 5.9.

Theorem 5.6-1 (Central Limit Theorem). If X̄ is the mean of a random sample X1, X2, ..., Xn of size n from a distribution with a finite mean μ and a finite positive variance σ², then the distribution of

W = (X̄ − μ)/(σ/√n) = (Σ_{i=1}^{n} Xi − nμ)/(√n σ)

is N(0, 1) in the limit as n → ∞.
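The theorem is easy to illustrate by simulation; the sketch below (an illustration only, with an Exponential(1) population chosen arbitrarily) standardizes sample means and compares two probabilities with their N(0, 1) values:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 100_000
mu, sigma = 1.0, 1.0    # mean and standard deviation of the Exponential(1) population

# Draw `reps` samples of size n and standardize each sample mean.
samples = rng.exponential(scale=1.0, size=(reps, n))
W = (samples.mean(axis=1) - mu) / (sigma / np.sqrt(n))

# The empirical distribution of W should be close to N(0, 1).
print(np.mean(np.abs(W) <= 1.00))   # ≈ 0.68
print(np.mean(np.abs(W) <= 1.96))   # ≈ 0.95
```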


g(u) =
  6(324u⁵/5),                                                0 < u < 1/6,
  6(1/20 − 3u/2 + 18u² − 108u³ + 324u⁴ − 324u⁵),             1/6 ≤ u < 2/6,
  6(−79/20 + 117u/2 − 342u² + 972u³ − 1296u⁴ + 648u⁵),       2/6 ≤ u < 3/6,
  6(731/20 − 693u/2 + 1278u² − 2268u³ + 1944u⁴ − 648u⁵),     3/6 ≤ u < 4/6,
  6(−1829/20 + 1227u/2 − 1602u² + 2052u³ − 1296u⁴ + 324u⁵),  4/6 ≤ u < 5/6,
  6(324/5 − 324u + 648u² − 648u³ + 324u⁴ − 324u⁵/5),         5/6 ≤ u < 1.

We can also calculate

∫_{1/6}^{2/6} g(u) du = 19/240 = 0.0792

and

∫_{11/18}^{1} g(u) du = 5,818/32,805 = 0.17735.

Although these integrations are not difficult, they are tedious to do by hand.
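If desired, the integrals can be checked numerically; the following sketch (an illustration, assuming the piecewise coefficients reconstructed above and using SciPy's quad) reproduces the two values:

```python
from scipy.integrate import quad

def g(u):
    # Piecewise pdf g(u) from above, coefficients listed in ascending powers of u.
    if   u < 1/6: c = (0, 0, 0, 0, 0, 324/5)
    elif u < 2/6: c = (1/20, -3/2, 18, -108, 324, -324)
    elif u < 3/6: c = (-79/20, 117/2, -342, 972, -1296, 648)
    elif u < 4/6: c = (731/20, -693/2, 1278, -2268, 1944, -648)
    elif u < 5/6: c = (-1829/20, 1227/2, -1602, 2052, -1296, 324)
    else:         c = (324/5, -324, 648, -648, 324, -324/5)
    return 6 * sum(ck * u**k for k, ck in enumerate(c))

print(quad(g, 1/6, 2/6)[0])    # ≈ 0.0792  (19/240)
print(quad(g, 11/18, 1)[0])    # ≈ 0.1773  (5818/32805)
```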

5.8 CHEBYSHEV'S INEQUALITY AND CONVERGENCE IN PROBABILITY

In this section, we use Chebyshev's inequality to show, in another sense, that the sample mean, X̄, is a good statistic to use to estimate a population mean μ; the relative frequency of success in n independent Bernoulli trials, Y/n, is a good statistic for estimating p. We examine the effect of the sample size n on these estimates.

We begin by showing that Chebyshev's inequality gives added significance to the standard deviation in terms of bounding certain probabilities. The inequality is valid for all distributions for which the standard deviation exists. The proof is given for the discrete case, but it holds for the continuous case, with integrals replacing summations.

Theorem 5.8-1 (Chebyshev's Inequality). If the random variable X has a mean μ and variance σ², then, for every k ≥ 1,

P(|X − μ| ≥ kσ) ≤ 1/k².

Proof: Let f(x) denote the pmf of X. Then

σ² = E[(X − μ)²] = Σ_{x∈S} (x − μ)² f(x)
   = Σ_{x∈A} (x − μ)² f(x) + Σ_{x∈A′} (x − μ)² f(x),    (5.8-1)

where

A = {x : |x − μ| ≥ kσ}.

The second term in the right-hand member of Equation 5.8-1 is the sum of nonnegative numbers and thus is greater than or equal to zero. Hence,

σ² ≥ Σ_{x∈A} (x − μ)² f(x).

However, in A, |x − μ| ≥ kσ; so

σ² ≥ Σ_{x∈A} (kσ)² f(x) = k²σ² Σ_{x∈A} f(x).

But the latter summation equals P(X ∈ A); thus,

σ² ≥ k²σ² P(X ∈ A) = k²σ² P(|X − μ| ≥ kσ).

That is,

P(|X − μ| ≥ kσ) ≤ 1/k².

Corollary 5.8-1. If ε = kσ, then

P(|X − μ| ≥ ε) ≤ σ²/ε².

In words, Chebyshev's inequality states that the probability that X differs from its mean by at least k standard deviations is less than or equal to 1/k². It follows that the probability that X differs from its mean by less than k standard deviations is at least 1 − 1/k². That is,

P(|X − μ| < kσ) ≥ 1 − 1/k².

From the corollary, it also follows that

P(|X − μ| < ε) ≥ 1 − σ²/ε².

Thus, Chebyshev's inequality can be used as a bound for certain probabilities. However, in many instances, the bound is not very close to the true probability.

Example 5.8-1
If it is known that X has a mean of 25 and a variance of 16, then, since σ = 4, a lower bound for P(17 < X < 33) is given by
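Carrying the computation through with k = 2 (so that kσ = 8):

P(17 < X < 33) = P(|X − 25| < 8) = P(|X − 25| < 2σ) ≥ 1 − 1/2² = 3/4 = 0.75.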

214 Chapter 5 Distributions of Functions of Random Variables

σ 2 = E[(X − µ)2] =∑

x∈S

(x − µ)2f (x)

=∑

x∈A

(x − µ)2f (x) +∑

x∈A′(x − µ)2f (x), (5.8-1)

where

A = {x : |x − µ| ≥ kσ }.The second term in the right-hand member of Equation 5.8-1 is the sum of non-negative numbers and thus is greater than or equal to zero. Hence,

σ 2 ≥∑

x∈A

(x − µ)2f (x).

However, in A, |x − µ| ≥ kσ ; so

σ 2 ≥∑

x∈A

(kσ )2f (x) = k2σ 2∑

x∈A

f (x).

But the latter summation equals P(X ∈ A); thus,

σ 2 ≥ k2σ 2P(X ∈ A) = k2σ 2P(|X − µ| ≥ kσ ).

That is,

P(|X − µ| ≥ kσ ) ≤ 1k2 . !

Corollary5.8-1

If ε = kσ , then

P(|X − µ| ≥ ε) ≤ σ 2

ε2 .

"

In words, Chebyshev’s inequality states that the probability that X differs fromits mean by at least k standard deviations is less than or equal to 1/k2. It follows thatthe probability that X differs from its mean by less than k standard deviations is atleast 1 − 1/k2. That is,

P(|X − µ| < kσ ) ≥ 1 − 1k2 .

From the corollary, it also follows that

P(|X − µ| < ε) ≥ 1 − σ 2

ε2 .

Thus, Chebyshev’s inequality can be used as a bound for certain probabilities.However, in many instances, the bound is not very close to the true probability.

Example5.8-1

If it is known that X has a mean of 25 and a variance of 16, then, since σ = 4, a lowerbound for P(17 < X < 33) is given by

214 Chapter 5 Distributions of Functions of Random Variables

σ 2 = E[(X − µ)2] =∑

x∈S

(x − µ)2f (x)

=∑

x∈A

(x − µ)2f (x) +∑

x∈A′(x − µ)2f (x), (5.8-1)

where

A = {x : |x − µ| ≥ kσ }.The second term in the right-hand member of Equation 5.8-1 is the sum of non-negative numbers and thus is greater than or equal to zero. Hence,

σ 2 ≥∑

x∈A

(x − µ)2f (x).

However, in A, |x − µ| ≥ kσ ; so

σ 2 ≥∑

x∈A

(kσ )2f (x) = k2σ 2∑

x∈A

f (x).

But the latter summation equals P(X ∈ A); thus,

σ 2 ≥ k2σ 2P(X ∈ A) = k2σ 2P(|X − µ| ≥ kσ ).

That is,

P(|X − µ| ≥ kσ ) ≤ 1k2 . !

Corollary5.8-1

If ε = kσ , then

P(|X − µ| ≥ ε) ≤ σ 2

ε2 .

"

In words, Chebyshev’s inequality states that the probability that X differs fromits mean by at least k standard deviations is less than or equal to 1/k2. It follows thatthe probability that X differs from its mean by less than k standard deviations is atleast 1 − 1/k2. That is,

P(|X − µ| < kσ ) ≥ 1 − 1k2 .

From the corollary, it also follows that

P(|X − µ| < ε) ≥ 1 − σ 2

ε2 .

Thus, Chebyshev’s inequality can be used as a bound for certain probabilities.However, in many instances, the bound is not very close to the true probability.

Example5.8-1

If it is known that X has a mean of 25 and a variance of 16, then, since σ = 4, a lowerbound for P(17 < X < 33) is given by

Page 11: Discrete Distributions - Kennesaw State University

Chapte rChapte r

4Bivariate Distributions

4.1 Bivariate Distributions of the Discrete Type4.2 The Correlation Coefficient4.3 Conditional Distributions

4.4 Bivariate Distributions of the ContinuousType

4.5 The Bivariate Normal Distribution

4.1 BIVARIATE DISTRIBUTIONS OF THE DISCRETE TYPESo far, we have taken only one measurement on a single item under observation.However, it is clear in many practical cases that it is possible, and often very desir-able, to take more than one measurement of a random observation. Suppose, forexample, that we are observing female college students to obtain information aboutsome of their physical characteristics, such as height, x, and weight, y, because we aretrying to determine a relationship between those two characteristics. For instance,there may be some pattern between height and weight that can be described byan appropriate curve y = u(x). Certainly, not all of the points observed will beon this curve, but we want to attempt to find the “best” curve to describe therelationship and then say something about the variation of the points around thecurve.

Another example might concern high school rank—say, x—and the ACT(or SAT) score—say, y—of incoming college students. What is the relationshipbetween these two characteristics? More importantly, how can we use those mea-surements to predict a third one, such as first-year college GPA—say, z—witha function z = v(x, y)? This is a very important problem for college admissionoffices, particularly when it comes to awarding an athletic scholarship, because theincoming student–athlete must satisfy certain conditions before receiving such anaward.

Definition 4.1-1Let X and Y be two random variables defined on a discrete space. Let S denotethe corresponding two-dimensional space of X and Y, the two random vari-ables of the discrete type. The probability that X = x and Y = y is denoted byf (x, y) = P(X = x, Y = y). The function f (x, y) is called the joint probabilitymass function (joint pmf) of X and Y and has the following properties:

125

126 Chapter 4 Bivariate Distributions

(a) 0 ≤ f (x, y) ≤ 1.

(b)∑ ∑

(x,y)∈S

f (x, y) = 1.

(c) P[(X, Y) ∈ A] =∑ ∑

(x,y)∈A

f (x, y), where A is a subset of the space S.

The following example will make this definition more meaningful.

Example4.1-1

Roll a pair of fair dice. For each of the 36 sample points with probability 1/36, letX denote the smaller and Y the larger outcome on the dice. For example, if theoutcome is (3, 2), then the observed values are X = 2, Y = 3. The event {X = 2,Y = 3} could occur in one of two ways—(3, 2) or (2, 3)—so its probability is

136

+ 136

= 236

.

If the outcome is (2, 2), then the observed values are X = 2, Y = 2. Since the event{X = 2, Y = 2} can occur in only one way, P(X = 2, Y = 2) = 1/36. The joint pmfof X and Y is given by the probabilities

f (x, y) =

136

, 1 ≤ x = y ≤ 6,

236

, 1 ≤ x < y ≤ 6,

when x and y are integers. Figure 4.1-1 depicts the probabilities of the various pointsof the space S.

1/36

2/36 1/36

1/36

2/36

1/36

2/36

2/36

2/36

1/36

2/36

2/36

2/36

5/36 3/367/36

y

2/36

2/36

2/36

2/36

1/36

1/36

x

11/36

2/36

2/36

2/36

5/36

9/36

1/36

11/364 53 621

7/36

3/36

9/36

6

5

4

3

2

1

Figure 4.1-1 Discrete joint pmf

Section 4.1 Bivariate Distributions of the Discrete Type 127

Notice that certain numbers have been recorded in the bottom and left-handmargins of Figure 4.1-1. These numbers are the respective column and row totalsof the probabilities. The column totals are the respective probabilities that X willassume the values in the x space SX = {1, 2, 3, 4, 5, 6}, and the row totals arethe respective probabilities that Y will assume the values in the y space SY ={1, 2, 3, 4, 5, 6}. That is, the totals describe the probability mass functions of X andY, respectively. Since each collection of these probabilities is frequently recordedin the margins and satisfies the properties of a pmf of one random variable, each iscalled a marginal pmf.

Definition 4.1-2Let X and Y have the joint probability mass function f (x, y) with space S. Theprobability mass function of X alone, which is called the marginal probabilitymass function of X, is defined by

fX(x) =∑

yf (x, y) = P(X = x), x ∈ SX ,

where the summation is taken over all possible y values for each given x in thex space SX . That is, the summation is over all (x, y) in S with a given x value.Similarly, the marginal probability mass function of Y is defined by

fY(y) =∑

xf (x, y) = P(Y = y), y ∈ SY ,

where the summation is taken over all possible x values for each given y in they space SY . The random variables X and Y are independent if and only if, forevery x ∈ SX and every y ∈ SY ,

P(X = x, Y = y) = P(X = x)P(Y = y)

or, equivalently,

f (x, y) = fX(x)fY(y);

otherwise, X and Y are said to be dependent.

We note in Example 4.1-1 that X and Y are dependent because there are manyx and y values for which f (x, y) "= fX(x)fY(y). For instance,

fX(1)fY(1) =(

1136

)(1

36

)"= 1

36= f (1, 1).

Example4.1-2

Let the joint pmf of X and Y be defined by

f (x, y) = x + y21

, x = 1, 2, 3, y = 1, 2.

Then

fX(x) =∑

yf (x, y) =

2∑

y=1

x + y21

= x + 121

+ x + 221

= 2x + 321

, x = 1, 2, 3,

Page 12: Discrete Distributions - Kennesaw State University

48 Probability and Statistics for Computer Scientists

(a) E(X) = 0.5

!0 0.5 1

(b) E(X) = 0.25

!0 0.25 0.5 1

FIGURE 3.3: Expectation as a center of gravity.

Similar arguments can be used to derive the general formula for the expectation.

Expectation,discrete case

µ = E(X) =∑

x

xP (x) (3.3)

This formula returns the center of gravity for a system with masses P (x) allocated at pointsx. Expected value is often denoted by a Greek letter µ.

In a certain sense, expectation is the best forecast of X . The variable itself is random. Ittakes different values with different probabilities P (x). At the same time, it has just oneexpectation E(X) which is non-random.

3.3.2 Expectation of a function

Often we are interested in another variable, Y , that is a function of X . For example, down-loading time depends on the connection speed, profit of a computer store depends on thenumber of computers sold, and bonus of its manager depends on this profit. Expectation ofY = g(X) is computed by a similar formula,

E {g(X)} =∑

x

g(x)P (x). (3.4)

Remark: Indeed, if g is a one-to-one function, then Y takes each value y = g(x) with probability

P (x), and the formula for E(Y ) can be applied directly. If g is not one-to-one, then some values ofg(x) will be repeated in (3.4). However, they are still multiplied by the corresponding probabilities.

When we add in (3.4), these probabilities are also added, thus each value of g(x) is still multipliedby the probability PY (g(x)).

3.3.3 Properties

The following linear properties of expectations follow directly from (3.3) and (3.4). For anyrandom variables X and Y and any non-random numbers a, b, and c, we have

50 Probability and Statistics for Computer Scientists

Example 3.10. Here is a rather artificial but illustrative scenario. Consider two users.One receives either 48 or 52 e-mail messages per day, with a 50-50% chance of each. Theother receives either 0 or 100 e-mails, also with a 50-50% chance. What is a common featureof these two distributions, and how are they different?

We see that both users receive the same average number of e-mails:

E(X) = E(Y ) = 50.

However, in the first case, the actual number of e-mails is always close to 50, whereas italways differs from it by 50 in the second case. The first random variable, X , is more stable;it has low variability. The second variable, Y , has high variability. ♦

This example shows that variability of a random variable is measured by its distance fromthe mean µ = E(X). In its turn, this distance is random too, and therefore, cannot serveas a characteristic of a distribution. It remains to square it and take the expectation of theresult.

DEFINITION 3.6

Variance of a random variable is defined as the expected squared deviationfrom the mean. For discrete random variables, variance is

σ2 = Var(X) = E (X − EX)2 =∑

x

(x− µ)2P (x)

Remark: Notice that if the distance to the mean is not squared, then the result is always µ−µ = 0bearing no information about the distribution of X.

According to this definition, variance is always non-negative. Further, it equals 0 only ifx = µ for all values of x, i.e., when X is constantly equal to µ. Certainly, a constant(non-random) variable has zero variability.

Variance can also be computed as

Var(X) = E(X2)− µ2. (3.6)

A proof of this is left as Exercise 3.38a.

DEFINITION 3.7

Standard deviation is a square root of variance,

σ = Std(X) =√

Var(X)

Continuing the Greek-letter tradition, variance is often denoted by σ2. Then, standarddeviation is σ.

If X is measured in some units, then its mean µ has the same measurement unit as X .Variance σ2 is measured in squared units, and therefore, it cannot be compared with X orµ. No matter how funny it sounds, it is rather normal to measure variance of profit in squareddollars, variance of class enrollment in squared students, and variance of available disk spacein squared gigabytes. When a squared root is taken, the resulting standard deviation σ isagain measured in the same units as X . This is the main reason of introducing yet anothermeasure of variability, σ.

Discrete Random Variables and Their Distributions 49

Propertiesof

expectations

E(aX + bY + c) = aE(X) + bE(Y ) + c

In particular,E(X + Y ) = E(X) + E(Y )E(aX) = aE(X)E(c) = c

For independent X and Y ,E(XY ) = E(X)E(Y )

(3.5)

Proof: The first property follows from the Addition Rule (3.2). For any X and Y ,

E(aX + bY + c) =∑

x

y

(ax+ by + c)P(X,Y )(x, y)

=∑

x

ax∑

y

P(X,Y )(x, y) +∑

y

by∑

x

P(X,Y )(x, y) + c∑

x

y

P(X,Y )(x, y)

= a∑

x

xPX(x) + b∑

y

yPY (y) + c.

The next three equalities are special cases. To prove the last property, we recall that P(X,Y )(x, y) =

PX(x)PY (y) for independent X and Y , and therefore,

E(XY ) =∑

x

y

(xy)PX(x)PY (y) =∑

x

xPX(x)∑

y

yPY (y) = E(X)E(Y ). !

Remark: The last property in (3.5) holds for some dependent variables too, hence it cannot be

used to verify independence of X and Y .

Example 3.9. In Example 3.6 on p. 46,

E(X) = (0)(0.5) + (1)(0.5) = 0.5 and

E(Y ) = (0)(0.4) + (1)(0.3) + (2)(0.15) + (3)(0.15) = 1.05,

therefore, the expected total number of errors is

E(X + Y ) = 0.5 + 1.05 = 1.65.

Remark: Clearly, the program will never have 1.65 errors, because the number of errors is alwaysinteger. Then, should we round 1.65 to 2 errors? Absolutely not, it would be a mistake. Although

both X and Y are integers, their expectations, or average values, do not have to be integers at all.

3.3.4 Variance and standard deviation

Expectation shows where the average value of a random variable is located, or where thevariable is expected to be, plus or minus some error. How large could this “error” be, andhow much can a variable vary around its expectation? Let us introduce some measures ofvariability.

50 Probability and Statistics for Computer Scientists

Example 3.10. Here is a rather artificial but illustrative scenario. Consider two users.One receives either 48 or 52 e-mail messages per day, with a 50-50% chance of each. Theother receives either 0 or 100 e-mails, also with a 50-50% chance. What is a common featureof these two distributions, and how are they different?

We see that both users receive the same average number of e-mails:

E(X) = E(Y ) = 50.

However, in the first case, the actual number of e-mails is always close to 50, whereas italways differs from it by 50 in the second case. The first random variable, X , is more stable;it has low variability. The second variable, Y , has high variability. ♦

This example shows that variability of a random variable is measured by its distance fromthe mean µ = E(X). In its turn, this distance is random too, and therefore, cannot serveas a characteristic of a distribution. It remains to square it and take the expectation of theresult.

DEFINITION 3.6

Variance of a random variable is defined as the expected squared deviationfrom the mean. For discrete random variables, variance is

σ2 = Var(X) = E (X − EX)2 =∑

x

(x− µ)2P (x)

Remark: Notice that if the distance to the mean is not squared, then the result is always µ−µ = 0bearing no information about the distribution of X.

According to this definition, variance is always non-negative. Further, it equals 0 only ifx = µ for all values of x, i.e., when X is constantly equal to µ. Certainly, a constant(non-random) variable has zero variability.

Variance can also be computed as

Var(X) = E(X2)− µ2. (3.6)

A proof of this is left as Exercise 3.38a.

DEFINITION 3.7

Standard deviation is a square root of variance,

σ = Std(X) =√

Var(X)

Continuing the Greek-letter tradition, variance is often denoted by σ2. Then, standarddeviation is σ.

If X is measured in some units, then its mean µ has the same measurement unit as X .Variance σ2 is measured in squared units, and therefore, it cannot be compared with X orµ. No matter how funny it sounds, it is rather normal to measure variance of profit in squareddollars, variance of class enrollment in squared students, and variance of available disk spacein squared gigabytes. When a squared root is taken, the resulting standard deviation σ isagain measured in the same units as X . This is the main reason of introducing yet anothermeasure of variability, σ.

50 Probability and Statistics for Computer Scientists

Example 3.10. Here is a rather artificial but illustrative scenario. Consider two users.One receives either 48 or 52 e-mail messages per day, with a 50-50% chance of each. Theother receives either 0 or 100 e-mails, also with a 50-50% chance. What is a common featureof these two distributions, and how are they different?

We see that both users receive the same average number of e-mails:

E(X) = E(Y ) = 50.

However, in the first case, the actual number of e-mails is always close to 50, whereas italways differs from it by 50 in the second case. The first random variable, X , is more stable;it has low variability. The second variable, Y , has high variability. ♦

This example shows that variability of a random variable is measured by its distance fromthe mean µ = E(X). In its turn, this distance is random too, and therefore, cannot serveas a characteristic of a distribution. It remains to square it and take the expectation of theresult.

DEFINITION 3.6

Variance of a random variable is defined as the expected squared deviationfrom the mean. For discrete random variables, variance is

σ2 = Var(X) = E (X − EX)2 =∑

x

(x− µ)2P (x)

Remark: Notice that if the distance to the mean is not squared, then the result is always µ−µ = 0bearing no information about the distribution of X.

According to this definition, variance is always non-negative. Further, it equals 0 only ifx = µ for all values of x, i.e., when X is constantly equal to µ. Certainly, a constant(non-random) variable has zero variability.

Variance can also be computed as

Var(X) = E(X2)− µ2. (3.6)

A proof of this is left as Exercise 3.38a.

DEFINITION 3.7

Standard deviation is a square root of variance,

σ = Std(X) =√

Var(X)

Continuing the Greek-letter tradition, variance is often denoted by σ2. Then, standarddeviation is σ.

If X is measured in some units, then its mean µ has the same measurement unit as X .Variance σ2 is measured in squared units, and therefore, it cannot be compared with X orµ. No matter how funny it sounds, it is rather normal to measure variance of profit in squareddollars, variance of class enrollment in squared students, and variance of available disk spacein squared gigabytes. When a squared root is taken, the resulting standard deviation σ isagain measured in the same units as X . This is the main reason of introducing yet anothermeasure of variability, σ.

Discrete Random Variables and Their Distributions 51

3.3.5 Covariance and correlation

Expectation, variance, and standard deviation characterize the distribution of a single ran-dom variable. Now we introduce measures of association of two random variables.

!

"

!

"

!

"

Y

X

Y

X

Y

X

(a) Cov(X,Y ) > 0 (b) Cov(X,Y ) < 0 (c) Cov(X,Y ) = 0

FIGURE 3.4: Positive, negative, and zero covariance.

DEFINITION 3.8

Covariance σXY = Cov(X,Y ) is defined as

Cov(X,Y ) = E {(X − EX)(Y − EY )}= E(XY )− E(X)E(Y )

It summarizes interrelation of two random variables.

Covariance is the expected product of deviations of X and Y from their respective expecta-tions. If Cov(X,Y ) > 0, then positive deviations (X− EX) are more likely to be multipliedby positive (Y − EY ), and negative (X− EX) are more likely to be multiplied by negative(Y − EY ). In short, large X imply large Y , and small X imply small Y . These variablesare positively correlated, Figure 3.4a.

Conversely, Cov(X,Y ) < 0 means that large X generally correspond to small Y and smallX correspond to large Y . These variables are negatively correlated, Figure 3.4b.

If Cov(X,Y ) = 0, we say that X and Y are uncorrelated, Figure 3.4c.

DEFINITION 3.9

Correlation coefficient between variables X and Y is defined as

ρ =Cov(X,Y )

( StdX)( StdY )

Correlation coefficient is a rescaled, normalized covariance. Notice that covarianceCov(X,Y ) has a measurement unit. It is measured in units of X multiplied by units ofY . As a result, it is not clear from its value whether X and Y are strongly or weakly corre-lated. Really, one has to compare Cov(X,Y ) with the magnitude of X and Y . Correlationcoefficient performs such a comparison, and as a result, it is dimensionless.

Discrete Random Variables and Their Distributions 51

3.3.5 Covariance and correlation

Expectation, variance, and standard deviation characterize the distribution of a single ran-dom variable. Now we introduce measures of association of two random variables.

!

"

!

"

!

"

Y

X

Y

X

Y

X

(a) Cov(X,Y ) > 0 (b) Cov(X,Y ) < 0 (c) Cov(X,Y ) = 0

FIGURE 3.4: Positive, negative, and zero covariance.

DEFINITION 3.8

Covariance σXY = Cov(X,Y ) is defined as

Cov(X,Y ) = E {(X − EX)(Y − EY )}= E(XY )− E(X)E(Y )

It summarizes interrelation of two random variables.

Covariance is the expected product of deviations of X and Y from their respective expecta-tions. If Cov(X,Y ) > 0, then positive deviations (X− EX) are more likely to be multipliedby positive (Y − EY ), and negative (X− EX) are more likely to be multiplied by negative(Y − EY ). In short, large X imply large Y , and small X imply small Y . These variablesare positively correlated, Figure 3.4a.

Conversely, Cov(X,Y ) < 0 means that large X generally correspond to small Y and smallX correspond to large Y . These variables are negatively correlated, Figure 3.4b.

If Cov(X,Y ) = 0, we say that X and Y are uncorrelated, Figure 3.4c.

DEFINITION 3.9

Correlation coefficient between variables X and Y is defined as

ρ =Cov(X,Y )

( StdX)( StdY )

Correlation coefficient is a rescaled, normalized covariance. Notice that covarianceCov(X,Y ) has a measurement unit. It is measured in units of X multiplied by units ofY . As a result, it is not clear from its value whether X and Y are strongly or weakly corre-lated. Really, one has to compare Cov(X,Y ) with the magnitude of X and Y . Correlationcoefficient performs such a comparison, and as a result, it is dimensionless.

Discrete Random Variables and Their Distributions 51

3.3.5 Covariance and correlation

Expectation, variance, and standard deviation characterize the distribution of a single ran-dom variable. Now we introduce measures of association of two random variables.

!

"

!

"

!

"

Y

X

Y

X

Y

X

(a) Cov(X,Y ) > 0 (b) Cov(X,Y ) < 0 (c) Cov(X,Y ) = 0

FIGURE 3.4: Positive, negative, and zero covariance.

DEFINITION 3.8

Covariance σXY = Cov(X,Y ) is defined as

Cov(X,Y ) = E {(X − EX)(Y − EY )}= E(XY )− E(X)E(Y )

It summarizes interrelation of two random variables.

Covariance is the expected product of deviations of X and Y from their respective expecta-tions. If Cov(X,Y ) > 0, then positive deviations (X− EX) are more likely to be multipliedby positive (Y − EY ), and negative (X− EX) are more likely to be multiplied by negative(Y − EY ). In short, large X imply large Y , and small X imply small Y . These variablesare positively correlated, Figure 3.4a.

Conversely, Cov(X,Y ) < 0 means that large X generally correspond to small Y and smallX correspond to large Y . These variables are negatively correlated, Figure 3.4b.

If Cov(X,Y ) = 0, we say that X and Y are uncorrelated, Figure 3.4c.

DEFINITION 3.9

Correlation coefficient between variables X and Y is defined as

ρ =Cov(X,Y )

( StdX)( StdY )

Correlation coefficient is a rescaled, normalized covariance. Notice that covarianceCov(X,Y ) has a measurement unit. It is measured in units of X multiplied by units ofY . As a result, it is not clear from its value whether X and Y are strongly or weakly corre-lated. Really, one has to compare Cov(X,Y ) with the magnitude of X and Y . Correlationcoefficient performs such a comparison, and as a result, it is dimensionless.

Page 13: Discrete Distributions - Kennesaw State University

52 Probability and Statistics for Computer Scientists

!

"

X

Y

!

"

X

Y

ρ = 1 ρ = −1

FIGURE 3.5: Perfect correlation: ρ = ±1.

How do we interpret the value of ρ? What possible values can it take?

As a special case of famous Cauchy-Schwarz inequality,

−1 ≤ ρ ≤ 1,

where |ρ| = 1 is possible only when all values of X and Y lie on a straight line, as inFigure 3.5. Further, values of ρ near 1 indicate strong positive correlation, values near (−1)show strong negative correlation, and values near 0 show weak correlation or no correlation.

3.3.6 Properties

The following properties of variances, covariances, and correlation coefficients hold for anyrandom variables X , Y , Z, and W and any non-random numbers a, b, c and d.

Properties of variances and covariances

Var(aX + bY + c) = a² Var(X) + b² Var(Y) + 2ab Cov(X,Y)

Cov(aX + bY, cZ + dW) = ac Cov(X,Z) + ad Cov(X,W) + bc Cov(Y,Z) + bd Cov(Y,W)

Cov(X,Y) = Cov(Y,X)

ρ(X,Y) = ρ(Y,X)

In particular,

Var(aX + b) = a² Var(X)
Cov(aX + b, cY + d) = ac Cov(X,Y)
ρ(aX + b, cY + d) = ρ(X,Y)

For independent X and Y,

Cov(X,Y) = 0
Var(X + Y) = Var(X) + Var(Y)     (3.7)
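A small simulation can make these rules concrete. The sketch below draws two independent samples and checks numerically that Var(aX + b) ≈ a² Var(X) and that Var(X + Y) ≈ Var(X) + Var(Y); the sample size, the constants, and the choice of Uniform and Exponential parent distributions are arbitrary illustration choices.

```python
import random

random.seed(1)
N = 100_000

def var(sample):
    """Plain (population-style) sample variance."""
    m = sum(sample) / len(sample)
    return sum((v - m) ** 2 for v in sample) / len(sample)

# Independent X ~ Uniform(0, 1) and Y ~ Exponential(rate 2); choices are arbitrary.
X = [random.random() for _ in range(N)]
Y = [random.expovariate(2.0) for _ in range(N)]

a, b = 3.0, 7.0
print(var([a * x + b for x in X]), a**2 * var(X))            # Var(aX+b) vs a^2 Var(X)
print(var([x + y for x, y in zip(X, Y)]), var(X) + var(Y))   # independence: variances add
```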



5.6 THE CENTRAL LIMIT THEOREM

In Section 5.4, we found that the mean X̄ of a random sample of size n from a distribution with mean µ and variance σ² > 0 is a random variable with the properties that

E(X̄) = µ and Var(X̄) = σ²/n.

As n increases, the variance of X̄ decreases. Consequently, the distribution of X̄ clearly depends on n, and we see that we are dealing with sequences of distributions.

In Theorem 5.5-1, we considered the pdf of X̄ when sampling is from the normal distribution N(µ, σ²). We showed that the distribution of X̄ is N(µ, σ²/n), and in Figure 5.5-1, by graphing the pdfs for several values of n, we illustrated the property that as n increases, the probability becomes concentrated in a small interval centered at µ. That is, as n increases, X̄ tends to converge to µ, or (X̄ − µ) tends to converge to 0 in a probability sense. (See Section 5.8.)

In general, if we let

W = √n (X̄ − µ)/σ = (X̄ − µ)/(σ/√n) = (Y − nµ)/(√n σ),

where Y is the sum of a random sample of size n from some distribution with mean µ and variance σ², then, for each positive integer n,

E(W) = E[(X̄ − µ)/(σ/√n)] = [E(X̄) − µ]/(σ/√n) = (µ − µ)/(σ/√n) = 0

and

Var(W) = E(W²) = E[(X̄ − µ)²/(σ²/n)] = E[(X̄ − µ)²]/(σ²/n) = (σ²/n)/(σ²/n) = 1.

Thus, while X̄ − µ tends to "degenerate" to zero, the factor √n/σ in √n(X̄ − µ)/σ "spreads out" the probability enough to prevent this degeneration. What, then, is the distribution of W as n increases? One observation that might shed some light on the answer to this question can be made immediately. If the sample arises from a normal distribution, then, from Theorem 5.5-1, we know that X̄ is N(µ, σ²/n), and hence W is N(0, 1) for each positive n. Thus, in the limit, the distribution of W must be N(0, 1). So if the solution of the question does not depend on the underlying distribution (i.e., it is unique), the answer must be N(0, 1). As we will see, that is exactly the case, and this result is so important that it is called the central limit theorem, the proof of which is given in Section 5.9.

Theorem 5.6-1 (Central Limit Theorem) If X̄ is the mean of a random sample X1, X2, . . . , Xn of size n from a distribution with a finite mean µ and a finite positive variance σ², then the distribution of

W = (X̄ − µ)/(σ/√n) = (X1 + X2 + · · · + Xn − nµ)/(√n σ)

is N(0, 1) in the limit as n → ∞.
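The theorem is easy to see empirically. The sketch below standardizes the mean of n exponential observations (a deliberately non-normal parent distribution) and checks that roughly 95% of the simulated W values fall in (−1.96, 1.96), as the N(0, 1) limit predicts; the choice of parent distribution, n, and the number of repetitions are all illustration choices, not part of the text.

```python
import random
from math import sqrt

random.seed(2)

# Parent distribution: Exponential(1), so mu = 1 and sigma = 1 (clearly not normal).
mu, sigma = 1.0, 1.0
n, reps = 50, 20_000

inside = 0
for _ in range(reps):
    xbar = sum(random.expovariate(1.0) for _ in range(n)) / n
    w = (xbar - mu) / (sigma / sqrt(n))   # W = (Xbar - mu) / (sigma / sqrt(n))
    if -1.96 < w < 1.96:
        inside += 1

print(inside / reps)   # close to 0.95 if W is approximately N(0, 1)
```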


g(u) =
  6(324u⁵/5),                                                0 < u < 1/6,
  6(1/20 − 3u/2 + 18u² − 108u³ + 324u⁴ − 324u⁵),              1/6 ≤ u < 2/6,
  6(−79/20 + 117u/2 − 342u² + 972u³ − 1296u⁴ + 648u⁵),        2/6 ≤ u < 3/6,
  6(731/20 − 693u/2 + 1278u² − 2268u³ + 1944u⁴ − 648u⁵),      3/6 ≤ u < 4/6,
  6(−1829/20 + 1227u/2 − 1602u² + 2052u³ − 1296u⁴ + 324u⁵),   4/6 ≤ u < 5/6,
  6(324/5 − 324u + 648u² − 648u³ + 324u⁴ − 324u⁵/5),          5/6 ≤ u < 1.

We can also calculate

∫_{1/6}^{2/6} g(u) du = 19/240 = 0.0792

and

∫_{11/18}^{1} g(u) du = 5,818/32,805 = 0.17735.

Although these integrations are not difficult, they are tedious to do by hand.

5.8 CHEBYSHEV'S INEQUALITY AND CONVERGENCE IN PROBABILITY

In this section, we use Chebyshev's inequality to show, in another sense, that the sample mean, X̄, is a good statistic to use to estimate a population mean µ; the relative frequency of success in n independent Bernoulli trials, Y/n, is a good statistic for estimating p. We examine the effect of the sample size n on these estimates.

We begin by showing that Chebyshev's inequality gives added significance to the standard deviation in terms of bounding certain probabilities. The inequality is valid for all distributions for which the standard deviation exists. The proof is given for the discrete case, but it holds for the continuous case, with integrals replacing summations.

Theorem 5.8-1 (Chebyshev's Inequality) If the random variable X has a mean µ and variance σ², then, for every k ≥ 1,

P(|X − µ| ≥ kσ) ≤ 1/k².

Proof Let f (x) denote the pmf of X. Then


σ² = E[(X − µ)²] = Σ_{x∈S} (x − µ)² f(x)
   = Σ_{x∈A} (x − µ)² f(x) + Σ_{x∈A′} (x − µ)² f(x),     (5.8-1)

where

A = {x : |x − µ| ≥ kσ}.

The second term in the right-hand member of Equation 5.8-1 is the sum of nonnegative numbers and thus is greater than or equal to zero. Hence,

σ² ≥ Σ_{x∈A} (x − µ)² f(x).

However, in A, |x − µ| ≥ kσ; so

σ² ≥ Σ_{x∈A} (kσ)² f(x) = k²σ² Σ_{x∈A} f(x).

But the latter summation equals P(X ∈ A); thus,

σ² ≥ k²σ² P(X ∈ A) = k²σ² P(|X − µ| ≥ kσ).

That is,

P(|X − µ| ≥ kσ) ≤ 1/k².

Corollary 5.8-1 If ε = kσ, then

P(|X − µ| ≥ ε) ≤ σ²/ε².

In words, Chebyshev's inequality states that the probability that X differs from its mean by at least k standard deviations is less than or equal to 1/k². It follows that the probability that X differs from its mean by less than k standard deviations is at least 1 − 1/k². That is,

P(|X − µ| < kσ) ≥ 1 − 1/k².

From the corollary, it also follows that

P(|X − µ| < ε) ≥ 1 − σ²/ε².

Thus, Chebyshev's inequality can be used as a bound for certain probabilities. However, in many instances, the bound is not very close to the true probability.

Example 5.8-1 If it is known that X has a mean of 25 and a variance of 16, then, since σ = 4, a lower bound for P(17 < X < 33) is given by

P(17 < X < 33) = P(|X − 25| < 8) = P(|X − 25| < 2σ) ≥ 1 − 1/2² = 3/4.
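To see how conservative the bound can be, the following sketch compares the Chebyshev guarantee P(|X − µ| < 2σ) ≥ 3/4 with the probability obtained by simulation; the normal distribution used here is only one possible choice, since the bound itself requires no distributional assumption.

```python
import random

random.seed(3)

mu, sigma, k = 25.0, 4.0, 2.0
N = 200_000

# X ~ N(25, 16) is an arbitrary illustration; Chebyshev holds for any distribution.
hits = sum(1 for _ in range(N) if abs(random.gauss(mu, sigma) - mu) < k * sigma)

print("Chebyshev lower bound:", 1 - 1 / k**2)   # 0.75
print("Simulated P(17 < X < 33):", hits / N)    # about 0.954 for a normal X
```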


Section 6.2 Exploratory Data Analysis 241

Table 6.2-5 Order statistics of 50 exam scores

34 38 42 42 45 47 51 52 54 57

58 58 59 60 61 63 65 65 66 67

68 69 69 70 71 71 72 73 73 74

75 75 76 76 77 79 81 81 82 83

83 84 84 85 87 90 91 93 93 97

From either these order statistics or the corresponding ordered stem-and-leaf display, it is rather easy to find the sample percentiles. If 0 < p < 1, then the (100p)th sample percentile has approximately np sample observations less than it and also n(1 − p) sample observations greater than it. One way of achieving this is to take the (100p)th sample percentile as the (n + 1)pth order statistic, provided that (n + 1)p is an integer. If (n + 1)p is not an integer but is equal to r plus some proper fraction—say, a/b—use a weighted average of the rth and the (r + 1)st order statistics. That is, define the (100p)th sample percentile as

πp = yr + (a/b)(yr+1 − yr) = (1 − a/b)yr + (a/b)yr+1.

Note that this formula is simply a linear interpolation between yr and yr+1. [If p < 1/(n + 1) or p > n/(n + 1), that sample percentile is not defined.]

As an illustration, consider the 50 ordered test scores. With p = 1/2, we find the 50th percentile by averaging the 25th and 26th order statistics, since (n + 1)p = (51)(1/2) = 25.5. Thus, the 50th percentile is

π0.50 = (1/2)y25 + (1/2)y26 = (71 + 71)/2 = 71.

With p = 1/4, we have (n + 1)p = (51)(1/4) = 12.75, and the 25th sample percentile is then

π0.25 = (1 − 0.75)y12 + (0.75)y13 = (0.25)(58) + (0.75)(59) = 58.75.

With p = 3/4, so that (n + 1)p = (51)(3/4) = 38.25, the 75th sample percentile is

π0.75 = (1 − 0.25)y38 + (0.25)y39 = (0.75)(81) + (0.25)(82) = 81.25.

Note that approximately 50%, 25%, and 75% of the sample observations are less than 71, 58.75, and 81.25, respectively.

Special names are given to certain percentiles. The 50th percentile is the median of the sample. The 25th, 50th, and 75th percentiles are, respectively, the first, second, and third quartiles of the sample. For notation, we let q1 = π0.25, q2 = m = π0.50, and q3 = π0.75. The 10th, 20th, . . . , and 90th percentiles are the deciles of the sample, so note that the 50th percentile is also the median, the second quartile, and the fifth decile. With the set of 50 test scores, since (51)(2/10) = 10.2 and (51)(9/10) = 45.9, the second and ninth deciles are, respectively,

π0.20 = (0.8)y10 + (0.2)y11 = (0.8)(57) + (0.2)(58) = 57.2

and

π0.90 = (0.1)y45 + (0.9)y46 = (0.1)(87) + (0.9)(90) = 89.7.
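The (n + 1)p rule with linear interpolation is easy to code. The sketch below implements it directly and reproduces the values quoted above for the 50 exam scores (median 71, quartiles 58.75 and 81.25, second and ninth deciles 57.2 and 89.7); the function name is ours, not the text's.

```python
def sample_percentile(order_stats, p):
    """(100p)th sample percentile via the (n+1)p rule with linear interpolation.

    order_stats must be sorted.  Undefined for p < 1/(n+1) or p > n/(n+1).
    """
    n = len(order_stats)
    pos = (n + 1) * p
    r = int(pos)                 # integer part
    frac = pos - r               # proper fraction a/b
    if r < 1 or r > n or (r == n and frac > 0):
        raise ValueError("percentile not defined for this p")
    if frac == 0:
        return order_stats[r - 1]
    return (1 - frac) * order_stats[r - 1] + frac * order_stats[r]

scores = sorted([
    34, 38, 42, 42, 45, 47, 51, 52, 54, 57, 58, 58, 59, 60, 61, 63, 65, 65, 66, 67,
    68, 69, 69, 70, 71, 71, 72, 73, 73, 74, 75, 75, 76, 76, 77, 79, 81, 81, 82, 83,
    83, 84, 84, 85, 87, 90, 91, 93, 93, 97])

for p in (0.25, 0.50, 0.75, 0.20, 0.90):
    print(p, sample_percentile(scores, p))   # matches 58.75, 71, 81.25, 57.2, 89.7
```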

Regression 365

FIGURE 11.3: Least squares estimation of the regression line. (a) Observed pairs (xi, yi) of predictor X and response Y; (b) estimated regression line.

Function G is usually sought in a suitable form: linear, quadratic, logarithmic, etc. The simplest form is linear.

11.1.3 Linear regression

The linear regression model assumes that the conditional expectation

G(x) = E {Y | X = x} = β0 + β1 x

is a linear function of x. As any linear function, it has an intercept β0 and a slope β1.

The intercept β0 = G(0) equals the value of the regression function for x = 0. Sometimes it has no physical meaning. For example, nobody will try to predict the value of a computer with 0 random access memory (RAM), and nobody will consider the Federal reserve rate in year 0. In other cases, the intercept is quite important. For example, according to Ohm's Law (V = RI), the voltage across an ideal conductor is proportional to the current. A non-zero intercept (V = V0 + RI) would show that the circuit is not ideal, and there is an external loss of voltage.

The slope β1 = G(x + 1) − G(x) is the predicted change in the response variable when the predictor changes by 1. This is a very important parameter that shows how fast we can change the expected response by varying the predictor. For example, customer satisfaction will increase by β1(∆x) when the quality of produced computers increases by (∆x).

A zero slope means absence of a linear relationship between X and Y. In this case, Y is expected to stay constant when X changes.


Regression estimates

b0 = β̂0 = ȳ − b1 x̄,    b1 = β̂1 = Sxy/Sxx,

where

Sxx = Σ_{i=1}^{n} (xi − x̄)²  and  Sxy = Σ_{i=1}^{n} (xi − x̄)(yi − ȳ).     (11.6)

Example 11.3 (World population). In Example 11.1, xi is the year, and yi is the world population during that year. To estimate the regression line in Figure 11.1, we compute

x̄ = 1980;  ȳ = 4558.1;

Sxx = (1950 − x̄)² + . . . + (2010 − x̄)² = 4550;

Sxy = (1950 − x̄)(2558 − ȳ) + . . . + (2010 − x̄)(6864 − ȳ) = 337250.

Then

b1 = Sxy/Sxx = 74.1,

b0 = ȳ − b1 x̄ = −142201.

The estimated regression line is

G(x) = b0 + b1 x = −142201 + 74.1x.

We conclude that the world population grows at the average rate of 74.1 million every year.

We can use the obtained equation to predict the future growth of the world population. Regression predictions for years 2015 and 2020 are

G(2015) = b0 + 2015 b1 = 7152 million people

G(2020) = b0 + 2020 b1 = 7523 million people
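Equations (11.6) translate directly into code. The sketch below computes b1 = Sxy/Sxx and b0 = ȳ − b1 x̄ for an arbitrary set of (x, y) pairs; the tiny data set is hypothetical and used only to exercise the formulas (the world-population example would work the same way once its (year, population) pairs are supplied).

```python
def least_squares(xs, ys):
    """Estimate the regression line G(x) = b0 + b1*x by least squares (Eq. 11.6)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Hypothetical predictor/response pairs, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
b0, b1 = least_squares(x, y)
print(b0, b1)          # intercept and slope
print(b0 + b1 * 6)     # prediction at x = 6
```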

11.1.4 Regression and correlation

Recall from Section 3.3.5 that the covariance

Cov(X,Y) = E(X − E(X))(Y − E(Y))

and the correlation coefficient

ρ = Cov(X,Y) / [(Std X)(Std Y)]

302 Chapter 7 Interval Estimation

Thus, since the probability of the first of these is 1 − α, the probability of the last must also be 1 − α, because the latter is true if and only if the former is true. That is, we have

P[X̄ − zα/2(σ/√n) ≤ µ ≤ X̄ + zα/2(σ/√n)] = 1 − α.

So the probability that the random interval

[X̄ − zα/2(σ/√n), X̄ + zα/2(σ/√n)]

includes the unknown mean µ is 1 − α. Once the sample is observed and the sample mean computed to equal x̄, the interval [x̄ − zα/2(σ/√n), x̄ + zα/2(σ/√n)] becomes known. Since the probability that the random interval covers µ before the sample is drawn is equal to 1 − α, we now call the computed interval, x̄ ± zα/2(σ/√n) (for brevity), a 100(1 − α)% confidence interval for the unknown mean µ. For example, x̄ ± 1.96(σ/√n) is a 95% confidence interval for µ. The number 100(1 − α)%, or equivalently, 1 − α, is called the confidence coefficient.

We see that the confidence interval for µ is centered at the point estimate x̄ and is completed by subtracting and adding the quantity zα/2(σ/√n). Note that as n increases, zα/2(σ/√n) decreases, resulting in a shorter confidence interval with the same confidence coefficient 1 − α. A shorter confidence interval gives a more precise estimate of µ, regardless of the confidence we have in the estimate of µ. Statisticians who are not restricted by time, money, effort, or the availability of observations can obviously make the confidence interval as short as they like by increasing the sample size n. For a fixed sample size n, the length of the confidence interval can also be shortened by decreasing the confidence coefficient 1 − α. But if this is done, we achieve a shorter confidence interval at the expense of losing some confidence.

Example 7.1-1 Let X equal the length of life of a 60-watt light bulb marketed by a certain manufacturer. Assume that the distribution of X is N(µ, 1296). If a random sample of n = 27 bulbs is tested until they burn out, yielding a sample mean of x̄ = 1478 hours, then a 95% confidence interval for µ is

[x̄ − z0.025(σ/√n), x̄ + z0.025(σ/√n)] = [1478 − 1.96(36/√27), 1478 + 1.96(36/√27)]
                                       = [1478 − 13.58, 1478 + 13.58]
                                       = [1464.42, 1491.58].

The next example will help to give a better intuitive feeling for the interpretation of a confidence interval.
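A minimal sketch of the calculation in Example 7.1-1, assuming σ is known: the interval is just x̄ ± zα/2 σ/√n. The helper name is ours.

```python
from math import sqrt

def z_interval(xbar, sigma, n, z):
    """Confidence interval xbar ± z * sigma / sqrt(n) for a mean with sigma known."""
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Numbers from Example 7.1-1: n = 27 bulbs, xbar = 1478 hours, sigma = 36, z_0.025 = 1.96.
print(z_interval(1478, 36, 27, 1.96))   # approximately (1464.42, 1491.58)
```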

Example 7.1-2 Let x̄ be the observed sample mean of five observations of a random sample from the normal distribution N(µ, 16). A 90% confidence interval for the unknown mean µ is

[x̄ − 1.645√(16/5), x̄ + 1.645√(16/5)].


Section 7.1 Confidence Intervals for Means 305

1 − α = P[−tα/2(n−1) ≤ (X̄ − µ)/(S/√n) ≤ tα/2(n−1)]
      = P[−tα/2(n−1)(S/√n) ≤ X̄ − µ ≤ tα/2(n−1)(S/√n)]
      = P[−X̄ − tα/2(n−1)(S/√n) ≤ −µ ≤ −X̄ + tα/2(n−1)(S/√n)]
      = P[X̄ − tα/2(n−1)(S/√n) ≤ µ ≤ X̄ + tα/2(n−1)(S/√n)].

Thus, the observations of a random sample provide x̄ and s², and

[x̄ − tα/2(n−1)(s/√n), x̄ + tα/2(n−1)(s/√n)]

is a 100(1 − α)% confidence interval for µ.

Example 7.1-5 Let X equal the amount of butterfat in pounds produced by a typical cow during a 305-day milk production period between her first and second calves. Assume that the distribution of X is N(µ, σ²). To estimate µ, a farmer measured the butterfat production for n = 20 cows and obtained the following data:

481 537 513 583 453 510 570 500 457 555

618 327 350 643 499 421 505 637 599 392

For these data, x̄ = 507.50 and s = 89.75. Thus, a point estimate of µ is x̄ = 507.50. Since t0.05(19) = 1.729, a 90% confidence interval for µ is

507.50 ± 1.729(89.75/√20), or 507.50 ± 34.70, or equivalently, [472.80, 542.20].
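The same calculation in code, using the butterfat data above; only the critical value t0.05(19) = 1.729 is taken from the t table rather than computed.

```python
from math import sqrt

data = [481, 537, 513, 583, 453, 510, 570, 500, 457, 555,
        618, 327, 350, 643, 499, 421, 505, 637, 599, 392]

n = len(data)
xbar = sum(data) / n
s = sqrt(sum((v - xbar) ** 2 for v in data) / (n - 1))   # sample standard deviation

t = 1.729                                  # t_0.05(19), from the t table
half_width = t * s / sqrt(n)
print(xbar, s)                             # 507.50 and about 89.75
print(xbar - half_width, xbar + half_width)   # approximately (472.80, 542.20)
```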

Let T have a t distribution with n − 1 degrees of freedom. Then tα/2(n−1) > zα/2. Consequently, we would expect the interval x̄ ± zα/2(σ/√n) to be shorter than the interval x̄ ± tα/2(n−1)(s/√n). After all, we have more information, namely, the value of σ, in constructing the first interval. However, the length of the second interval is very much dependent on the value of s. If the observed s is smaller than σ, a shorter confidence interval could result by the second procedure. But on the average, x̄ ± zα/2(σ/√n) is the shorter of the two confidence intervals (Exercise 7.1-14).

Example 7.1-6 In Example 7.1-2, 50 confidence intervals were simulated for the mean of a normal distribution, assuming that the variance was known. For those same data, since t0.05(4) = 2.132, x̄ ± 2.132(s/√5) was used to calculate a 90% confidence interval for µ. For those particular 50 intervals, 46 contained the mean µ = 50. These 50 intervals are depicted in Figure 7.1-1(b). Note the different lengths of the intervals. Some are longer and some are shorter than the corresponding z intervals. The average length of the 50 t intervals is 7.137, which is quite close to the expected length of such an interval: 7.169. (See Exercise 7.1-14.) The length of the intervals that use z and σ = 4 is 5.885.

310 Chapter 7 Interval Estimation

has a t distribution with n + m − 2 degrees of freedom. That is,

T = {[X̄ − Ȳ − (µX − µY)] / √(σ²/n + σ²/m)} / √{[(n − 1)S²X/σ² + (m − 1)S²Y/σ²] / (n + m − 2)}

  = [X̄ − Ȳ − (µX − µY)] / √{[((n − 1)S²X + (m − 1)S²Y)/(n + m − 2)][1/n + 1/m]}

has a t distribution with r = n + m − 2 degrees of freedom. Thus, with t0 = tα/2(n+m−2), we have

P(−t0 ≤ T ≤ t0) = 1 − α.

Solving the inequalities for µX − µY yields

P(X̄ − Ȳ − t0 SP √(1/n + 1/m) ≤ µX − µY ≤ X̄ − Ȳ + t0 SP √(1/n + 1/m)) = 1 − α,

where the pooled estimator of the common standard deviation is

SP = √{[(n − 1)S²X + (m − 1)S²Y] / (n + m − 2)}.

If x̄, ȳ, and sp are the observed values of X̄, Ȳ, and SP, then

[x̄ − ȳ − t0 sp √(1/n + 1/m), x̄ − ȳ + t0 sp √(1/n + 1/m)]

is a 100(1 − α)% confidence interval for µX − µY.

Example 7.2-2 Suppose that scores on a standardized test in mathematics taken by students from large and small high schools are N(µX, σ²) and N(µY, σ²), respectively, where σ² is unknown. If a random sample of n = 9 students from large high schools yielded x̄ = 81.31, s²x = 60.76, and a random sample of m = 15 students from small high schools yielded ȳ = 78.61, s²y = 48.24, then the endpoints for a 95% confidence interval for µX − µY are given by

81.31 − 78.61 ± 2.074 √{[8(60.76) + 14(48.24)]/22} √(1/9 + 1/15)

because t0.025(22) = 2.074. The 95% confidence interval is [−3.65, 9.05].
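A sketch of the pooled-variance interval, using the summary statistics from Example 7.2-2 and the tabled value t0.025(22) = 2.074; the function name is ours.

```python
from math import sqrt

def pooled_t_interval(xbar, s2x, n, ybar, s2y, m, t0):
    """100(1-alpha)% CI for mu_X - mu_Y with a pooled variance estimate."""
    sp = sqrt(((n - 1) * s2x + (m - 1) * s2y) / (n + m - 2))
    half_width = t0 * sp * sqrt(1 / n + 1 / m)
    diff = xbar - ybar
    return diff - half_width, diff + half_width

# Example 7.2-2: n = 9, xbar = 81.31, s2x = 60.76; m = 15, ybar = 78.61, s2y = 48.24.
print(pooled_t_interval(81.31, 60.76, 9, 78.61, 48.24, 15, 2.074))   # about (-3.65, 9.05)
```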

REMARKS The assumption of equal variances, namely, σ²X = σ²Y, can be modified somewhat so that we are still able to find a confidence interval for µX − µY. That is, if we know the ratio σ²X/σ²Y of the variances, we can still make this type of statistical


Section 7.3 Confidence Intervals for Proportions 319

P[−zα/2 ≤ ((Y/n) − p)/√(p(1 − p)/n) ≤ zα/2] ≈ 1 − α.     (7.3-1)

If we proceed as we did when we found a confidence interval for µ in Section 7.1, we would obtain

P[Y/n − zα/2 √(p(1 − p)/n) ≤ p ≤ Y/n + zα/2 √(p(1 − p)/n)] ≈ 1 − α.

Unfortunately, the unknown parameter p appears in the endpoints of this inequality. There are two ways out of this dilemma. First, we could make an additional approximation, namely, replacing p with Y/n in p(1 − p)/n in the endpoints. That is, if n is large enough, it is still true that

P[Y/n − zα/2 √((Y/n)(1 − Y/n)/n) ≤ p ≤ Y/n + zα/2 √((Y/n)(1 − Y/n)/n)] ≈ 1 − α.

Thus, for large n, if the observed Y equals y, then the interval

[y/n − zα/2 √((y/n)(1 − y/n)/n), y/n + zα/2 √((y/n)(1 − y/n)/n)]

serves as an approximate 100(1 − α)% confidence interval for p. Frequently, this interval is written as

y/n ± zα/2 √((y/n)(1 − y/n)/n)     (7.3-2)

for brevity. This formulation clearly notes, as does x̄ ± zα/2(σ/√n) in Section 7.1, the reliability of the estimate y/n, namely, that we are 100(1 − α)% confident that p is within zα/2 √((y/n)(1 − y/n)/n) of p̂ = y/n.

A second way to solve for p in the inequality in Equation 7.3-1 is to note that

|Y/n − p| / √(p(1 − p)/n) ≤ zα/2

is equivalent to

H(p) = (Y/n − p)² − (zα/2)² p(1 − p)/n ≤ 0.     (7.3-3)

But H(p) is a quadratic expression in p. Thus, we can find those values of p for which H(p) ≤ 0 by finding the two zeros of H(p). Letting p̂ = Y/n and z0 = zα/2 in Equation 7.3-3, we have

H(p) = (1 + z0²/n)p² − (2p̂ + z0²/n)p + p̂².

By the quadratic formula, the zeros of H(p) are, after simplifications,

[p̂ + z0²/(2n) ± z0 √(p̂(1 − p̂)/n + z0²/(4n²))] / (1 + z0²/n).     (7.3-4)
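Both intervals are straightforward to compute. The sketch below returns the simple interval (7.3-2) and the interval formed by the two zeros in (7.3-4), which is often called the Wilson (score) interval; the observed counts used here are hypothetical.

```python
from math import sqrt

def simple_interval(y, n, z):
    """Approximate interval (7.3-2): y/n ± z * sqrt((y/n)(1 - y/n)/n)."""
    phat = y / n
    half_width = z * sqrt(phat * (1 - phat) / n)
    return phat - half_width, phat + half_width

def quadratic_interval(y, n, z):
    """Interval from the two zeros of H(p) in (7.3-4)."""
    phat = y / n
    center = phat + z**2 / (2 * n)
    half_width = z * sqrt(phat * (1 - phat) / n + z**2 / (4 * n**2))
    denom = 1 + z**2 / n
    return (center - half_width) / denom, (center + half_width) / denom

# Hypothetical data: y = 60 successes in n = 100 trials, 95% confidence (z = 1.96).
print(simple_interval(60, 100, 1.96))
print(quadratic_interval(60, 100, 1.96))
```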

358 Chapter 8 Tests of Statistical Hypotheses

Table 8.1-1 Tests of hypotheses about one mean, variance known

H0          H1          Critical Region
µ = µ0      µ > µ0      z ≥ zα or x̄ ≥ µ0 + zα σ/√n
µ = µ0      µ < µ0      z ≤ −zα or x̄ ≤ µ0 − zα σ/√n
µ = µ0      µ ≠ µ0      |z| ≥ zα/2 or |x̄ − µ0| ≥ zα/2 σ/√n

Z = (X̄ − µ0)/√(σ²/n) = (X̄ − µ0)/(σ/√n),     (8.1-1)

and the critical regions, at a significance level α, for the three respective alternative hypotheses would be (i) z ≥ zα, (ii) z ≤ −zα, and (iii) |z| ≥ zα/2. In terms of x̄, these three critical regions become (i) x̄ ≥ µ0 + zα(σ/√n), (ii) x̄ ≤ µ0 − zα(σ/√n), and (iii) |x̄ − µ0| ≥ zα/2(σ/√n).

The three tests and critical regions are summarized in Table 8.1-1. The underlying assumption is that the distribution is N(µ, σ²) and σ² is known.

It is usually the case that the variance σ² is not known. Accordingly, we now take a more realistic position and assume that the variance is unknown. Suppose our null hypothesis is H0: µ = µ0 and the two-sided alternative hypothesis is H1: µ ≠ µ0. Recall from Section 7.1, for a random sample X1, X2, . . . , Xn taken from a normal distribution N(µ, σ²), a confidence interval for µ is based on

T = (X̄ − µ)/√(S²/n) = (X̄ − µ)/(S/√n).

This suggests that T might be a good statistic to use for the test of H0: µ = µ0 with µ replaced by µ0. In addition, it is the natural statistic to use if we replace σ²/n by its unbiased estimator S²/n in (X̄ − µ0)/√(σ²/n) in Equation 8.1-1. If µ = µ0, we know that T has a t distribution with n − 1 degrees of freedom. Thus, with µ = µ0,

P[|T| ≥ tα/2(n−1)] = P[|X̄ − µ0|/(S/√n) ≥ tα/2(n−1)] = α.

Accordingly, if x̄ and s are, respectively, the sample mean and sample standard deviation, then the rule that rejects H0: µ = µ0 and accepts H1: µ ≠ µ0 if and only if

|t| = |x̄ − µ0|/(s/√n) ≥ tα/2(n−1)

provides a test of this hypothesis with significance level α. Note that this rule is equivalent to rejecting H0: µ = µ0 if µ0 is not in the open 100(1 − α)% confidence interval

(x̄ − tα/2(n−1)[s/√n], x̄ + tα/2(n−1)[s/√n]).

Table 8.1-2 summarizes tests of hypotheses for a single mean, along with the three possible alternative hypotheses, when the underlying distribution is N(µ, σ²), σ² is unknown, t = (x̄ − µ0)/(s/√n), and n ≤ 30. If n > 30, we use Table 8.1-1 for approximate tests, with σ replaced by s.



Table 8.1-2 Tests of hypotheses for one mean, variance unknown

H0          H1          Critical Region
µ = µ0      µ > µ0      t ≥ tα(n−1) or x̄ ≥ µ0 + tα(n−1) s/√n
µ = µ0      µ < µ0      t ≤ −tα(n−1) or x̄ ≤ µ0 − tα(n−1) s/√n
µ = µ0      µ ≠ µ0      |t| ≥ tα/2(n−1) or |x̄ − µ0| ≥ tα/2(n−1) s/√n

Example 8.1-3 Let X (in millimeters) equal the growth in 15 days of a tumor induced in a mouse. Assume that the distribution of X is N(µ, σ²). We shall test the null hypothesis H0: µ = µ0 = 4.0 mm against the two-sided alternative hypothesis H1: µ ≠ 4.0. If we use n = 9 observations and a significance level of α = 0.10, the critical region is

|t| = |x̄ − 4.0|/(s/√9) ≥ tα/2(8) = 1.860.

If we are given that n = 9, x̄ = 4.3, and s = 1.2, we see that

t = (4.3 − 4.0)/(1.2/√9) = 0.3/0.4 = 0.75.

Thus,

|t| = |0.75| < 1.860,

and we accept (do not reject) H0: µ = 4.0 at the α = 10% significance level. (See Figure 8.1-3.) The p-value is the two-sided probability of |T| ≥ 0.75, namely,

p-value = P(|T| ≥ 0.75) = 2P(T ≥ 0.75).

With our t tables with eight degrees of freedom, we cannot find this p-value exactly. It is about 0.50, because

P(|T| ≥ 0.706) = 2P(T ≥ 0.706) = 0.50.

However, Minitab gives a p-value of 0.4747. (See Figure 8.1-3.)

Figure 8.1-3 Test about mean of tumor growths
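The arithmetic of Example 8.1-3 in code: compute t and compare |t| with the tabled value tα/2(8) = 1.860. The helper name is ours.

```python
from math import sqrt

def one_sample_t(xbar, s, n, mu0):
    """Test statistic t = (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / sqrt(n))

# Example 8.1-3: n = 9 tumor growths, xbar = 4.3 mm, s = 1.2, H0: mu = 4.0.
t = one_sample_t(4.3, 1.2, 9, 4.0)
critical = 1.860            # t_0.05(8), from the t table
print(t)                    # 0.75
print(abs(t) >= critical)   # False, so H0 is not rejected at alpha = 0.10
```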

Section 8.2 Tests of the Equality of Two Means 367

0.5  1.0  1.5  2.0  2.5

for the X sample and

0.8  1.15  1.6  2.2  2.6

for the Y sample. The two box plots are shown in Figure 8.2-2.

Figure 8.2-2 Box plots for pea stem growths

Assuming independent random samples of sizes n and m, let x̄, ȳ, and s²p represent the observed unbiased estimates of the respective parameters µX, µY, and σ²X = σ²Y of two normal distributions with a common variance. Then α-level tests of certain hypotheses are given in Table 8.2-1 when σ²X = σ²Y. If the common-variance assumption is violated, but not too badly, the test is satisfactory, but the significance levels are only approximate. The t statistic and sp are given in Equations 8.2-1 and 8.2-2, respectively.

REMARK Again, to emphasize the relationship between confidence intervals and tests of hypotheses, we note that each of the tests in Table 8.2-1 has a corresponding confidence interval. For example, the first one-sided test is equivalent to saying that we reject H0: µX − µY = 0 if zero is not in the one-sided confidence interval with lower bound

x̄ − ȳ − tα(n+m−2) sp √(1/n + 1/m).

Table 8.2-1 Tests of hypotheses for equality of two means

H0          H1          Critical Region
µX = µY     µX > µY     t ≥ tα(n+m−2) or x̄ − ȳ ≥ tα(n+m−2) sp √(1/n + 1/m)
µX = µY     µX < µY     t ≤ −tα(n+m−2) or x̄ − ȳ ≤ −tα(n+m−2) sp √(1/n + 1/m)
µX = µY     µX ≠ µY     |t| ≥ tα/2(n+m−2) or |x̄ − ȳ| ≥ tα/2(n+m−2) sp √(1/n + 1/m)

376 Chapter 8 Tests of Statistical Hypotheses

rolled to yield a total of n = 8000 observations. Let Y equal the number of times that 6 resulted in the 8000 trials. The test statistic is

Z = (Y/n − 1/6)/√((1/6)(5/6)/n) = (Y/8000 − 1/6)/√((1/6)(5/6)/8000).

If we use a significance level of α = 0.05, the critical region is

z ≥ z0.05 = 1.645.

The results of the experiment yielded y = 1389, so the calculated value of the test statistic is

z = (1389/8000 − 1/6)/√((1/6)(5/6)/8000) = 1.67.

Since

z = 1.67 > 1.645,

the null hypothesis is rejected, and the experimental results indicate that these dice favor a 6 more than a fair die would. (You could perform your own experiment to check out other dice.)
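The same test statistic in code, with the dice data; the critical value 1.645 is z0.05 and the helper name is ours.

```python
from math import sqrt

def proportion_z(y, n, p0):
    """Z = (Y/n - p0) / sqrt(p0 * (1 - p0) / n)."""
    return (y / n - p0) / sqrt(p0 * (1 - p0) / n)

# Dice example: y = 1389 sixes in n = 8000 rolls, H0: p = 1/6, one-sided H1: p > 1/6.
z = proportion_z(1389, 8000, 1 / 6)
print(round(z, 2))     # 1.67
print(z >= 1.645)      # True, so H0 is rejected at alpha = 0.05
```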

There are times when a two-sided alternative is appropriate; that is, here we test H0: p = p0 against H1: p ≠ p0. For example, suppose that the pass rate in the usual beginning statistics course is p0. There has been an intervention (say, some new teaching method) and it is not known whether the pass rate will increase, decrease, or stay about the same. Thus, we test the null (no-change) hypothesis H0: p = p0 against the two-sided alternative H1: p ≠ p0. A test with the approximate significance level α for doing this is to reject H0: p = p0 if

|Z| = |Y/n − p0| / √(p0(1 − p0)/n) ≥ zα/2,

since, under H0, P(|Z| ≥ zα/2) ≈ α. These tests of approximate significance level α are summarized in Table 8.3-1. The rejection region for H0 is often called the critical region of the test, and we use that terminology in the table.

The p-value associated with a test is the probability, under the null hypothesis H0, that the test statistic (a random variable) is equal to or exceeds the observed value (a constant) of the test statistic in the direction of the alternative hypothesis.

Table 8.3-1 Tests of hypotheses for one proportion

H0        H1        Critical Region
p = p0    p > p0    z = (y/n − p0)/√(p0(1 − p0)/n) ≥ zα
p = p0    p < p0    z = (y/n − p0)/√(p0(1 − p0)/n) ≤ −zα
p = p0    p ≠ p0    |z| = |y/n − p0|/√(p0(1 − p0)/n) ≥ zα/2


Confidence Intervals

Parameter: µ
Assumptions: N(µ, σ²) or n large, σ² known
Endpoints: x̄ ± zα/2 σ/√n

Parameter: µ
Assumptions: N(µ, σ²), σ² unknown
Endpoints: x̄ ± tα/2(n−1) s/√n

Parameter: µX − µY
Assumptions: N(µX, σ²X), N(µY, σ²Y), σ²X and σ²Y known
Endpoints: x̄ − ȳ ± zα/2 √(σ²X/n + σ²Y/m)

Parameter: µX − µY
Assumptions: variances unknown, large samples
Endpoints: x̄ − ȳ ± zα/2 √(s²x/n + s²y/m)

Parameter: µX − µY
Assumptions: N(µX, σ²X), N(µY, σ²Y), σ²X = σ²Y unknown
Endpoints: x̄ − ȳ ± tα/2(n+m−2) sp √(1/n + 1/m), where sp = √{[(n − 1)s²x + (m − 1)s²y]/(n + m − 2)}

Parameter: µD = µX − µY
Assumptions: X and Y normal, but dependent
Endpoints: d̄ ± tα/2(n−1) sd/√n

Parameter: p
Assumptions: b(n, p), n is large
Endpoints: y/n ± zα/2 √{(y/n)[1 − (y/n)]/n}

Parameter: p1 − p2
Assumptions: b(n1, p1), b(n2, p2)
Endpoints: y1/n1 − y2/n2 ± zα/2 √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2), where p̂1 = y1/n1 and p̂2 = y2/n2


484 Appendix B Tables

Table I Binomial Coefficients

C(n, r) = n!/[r!(n − r)!] = C(n, n − r)

Each row gives n followed by C(n, r) for r = 0, 1, 2, . . . , 13.

0 1

1 1 1

2 1 2 1

3 1 3 3 1

4 1 4 6 4 1

5 1 5 10 10 5 1

6 1 6 15 20 15 6 1

7 1 7 21 35 35 21 7 1

8 1 8 28 56 70 56 28 8 1

9 1 9 36 84 126 126 84 36 9 1

10 1 10 45 120 210 252 210 120 45 10 1

11 1 11 55 165 330 462 462 330 165 55 11 1

12 1 12 66 220 495 792 924 792 495 220 66 12 1

13 1 13 78 286 715 1,287 1,716 1,716 1,287 715 286 78 13 1

14 1 14 91 364 1,001 2,002 3,003 3,432 3,003 2,002 1,001 364 91 14

15 1 15 105 455 1,365 3,003 5,005 6,435 6,435 5,005 3,003 1,365 455 105

16 1 16 120 560 1,820 4,368 8,008 11,440 12,870 11,440 8,008 4,368 1,820 560

17 1 17 136 680 2,380 6,188 12,376 19,448 24,310 24,310 19,448 12,376 6,188 2,380

18 1 18 153 816 3,060 8,568 18,564 31,824 43,758 48,620 43,758 31,824 18,564 8,568

19 1 19 171 969 3,876 11,628 27,132 50,388 75,582 92,378 92,378 75,582 50,388 27,132

20 1 20 190 1,140 4,845 15,504 38,760 77,520 125,970 167,960 184,756 167,960 125,970 77,520

21 1 21 210 1,330 5,985 20,349 54,264 116,280 203,490 293,930 352,716 352,716 293,930 203,490

22 1 22 231 1,540 7,315 26,334 74,613 170,544 319,770 497,420 646,646 705,432 646,646 497,420

23 1 23 253 1,771 8,855 33,649 100,947 245,157 490,314 817,190 1,144,066 1,352,078 1,352,078 1,144,066

24 1 24 276 2,024 10,626 42,504 134,596 346,104 735,471 1,307,504 1,961,256 2,496,144 2,704,156 2,496,144

25 1 25 300 2,300 12,650 53,130 177,100 480,700 1,081,575 2,042,975 3,268,760 4,457,400 5,200,300 5,200,300

26 1 26 325 2,600 14,950 65,780 230,230 657,800 1,562,275 3,124,550 5,311,735 7,726,160 9,657,700 10,400,600

For r > 13 you may use the identity C(n, r) = C(n, n − r).


Table II The Binomial Distribution

F(x) = P(X ≤ x) = Σ_{k=0}^{x} n!/[k!(n − k)!] p^k (1 − p)^(n−k)

[The accompanying graphs show the pmf f(x) and the cdf F(x) of the b(8, 0.35) distribution.]

p
n x 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50

2 0 0.9025 0.8100 0.7225 0.6400 0.5625 0.4900 0.4225 0.3600 0.3025 0.25001 0.9975 0.9900 0.9775 0.9600 0.9375 0.9100 0.8775 0.8400 0.7975 0.75002 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

3 0 0.8574 0.7290 0.6141 0.5120 0.4219 0.3430 0.2746 0.2160 0.1664 0.12501 0.9928 0.9720 0.9392 0.8960 0.8438 0.7840 0.7182 0.6480 0.5748 0.50002 0.9999 0.9990 0.9966 0.9920 0.9844 0.9730 0.9571 0.9360 0.9089 0.87503 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

4 0 0.8145 0.6561 0.5220 0.4096 0.3164 0.2401 0.1785 0.1296 0.0915 0.06251 0.9860 0.9477 0.8905 0.8192 0.7383 0.6517 0.5630 0.4752 0.3910 0.31252 0.9995 0.9963 0.9880 0.9728 0.9492 0.9163 0.8735 0.8208 0.7585 0.68753 1.0000 0.9999 0.9995 0.9984 0.9961 0.9919 0.9850 0.9744 0.9590 0.93754 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

5 0 0.7738 0.5905 0.4437 0.3277 0.2373 0.1681 0.1160 0.0778 0.0503 0.03121 0.9774 0.9185 0.8352 0.7373 0.6328 0.5282 0.4284 0.3370 0.2562 0.18752 0.9988 0.9914 0.9734 0.9421 0.8965 0.8369 0.7648 0.6826 0.5931 0.50003 1.0000 0.9995 0.9978 0.9933 0.9844 0.9692 0.9460 0.9130 0.8688 0.81254 1.0000 1.0000 0.9999 0.9997 0.9990 0.9976 0.9947 0.9898 0.9815 0.96885 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

6 0 0.7351 0.5314 0.3771 0.2621 0.1780 0.1176 0.0754 0.0467 0.0277 0.01561 0.9672 0.8857 0.7765 0.6553 0.5339 0.4202 0.3191 0.2333 0.1636 0.10942 0.9978 0.9842 0.9527 0.9011 0.8306 0.7443 0.6471 0.5443 0.4415 0.34383 0.9999 0.9987 0.9941 0.9830 0.9624 0.9295 0.8826 0.8208 0.7447 0.65624 1.0000 0.9999 0.9996 0.9984 0.9954 0.9891 0.9777 0.9590 0.9308 0.89065 1.0000 1.0000 1.0000 0.9999 0.9998 0.9993 0.9982 0.9959 0.9917 0.98446 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

7 0 0.6983 0.4783 0.3206 0.2097 0.1335 0.0824 0.0490 0.0280 0.0152 0.00781 0.9556 0.8503 0.7166 0.5767 0.4449 0.3294 0.2338 0.1586 0.1024 0.0625


Table II continued

p

n x 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50

2 0.9962 0.9743 0.9262 0.8520 0.7564 0.6471 0.5323 0.4199 0.3164 0.22663 0.9998 0.9973 0.9879 0.9667 0.9294 0.8740 0.8002 0.7102 0.6083 0.50004 1.0000 0.9998 0.9988 0.9953 0.9871 0.9712 0.9444 0.9037 0.8471 0.77345 1.0000 1.0000 0.9999 0.9996 0.9987 0.9962 0.9910 0.9812 0.9643 0.93756 1.0000 1.0000 1.0000 1.0000 0.9999 0.9998 0.9994 0.9984 0.9963 0.99227 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

8 0 0.6634 0.4305 0.2725 0.1678 0.1001 0.0576 0.0319 0.0168 0.0084 0.00391 0.9428 0.8131 0.6572 0.5033 0.3671 0.2553 0.1691 0.1064 0.0632 0.03522 0.9942 0.9619 0.8948 0.7969 0.6785 0.5518 0.4278 0.3154 0.2201 0.14453 0.9996 0.9950 0.9786 0.9437 0.8862 0.8059 0.7064 0.5941 0.4770 0.36334 1.0000 0.9996 0.9971 0.9896 0.9727 0.9420 0.8939 0.8263 0.7396 0.63675 1.0000 1.0000 0.9998 0.9988 0.9958 0.9887 0.9747 0.9502 0.9115 0.85556 1.0000 1.0000 1.0000 0.9999 0.9996 0.9987 0.9964 0.9915 0.9819 0.96487 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9998 0.9993 0.9983 0.99618 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

9 0 0.6302 0.3874 0.2316 0.1342 0.0751 0.0404 0.0207 0.0101 0.0046 0.00201 0.9288 0.7748 0.5995 0.4362 0.3003 0.1960 0.1211 0.0705 0.0385 0.01952 0.9916 0.9470 0.8591 0.7382 0.6007 0.4628 0.3373 0.2318 0.1495 0.08983 0.9994 0.9917 0.9661 0.9144 0.8343 0.7297 0.6089 0.4826 0.3614 0.25394 1.0000 0.9991 0.9944 0.9804 0.9511 0.9012 0.8283 0.7334 0.6214 0.50005 1.0000 0.9999 0.9994 0.9969 0.9900 0.9747 0.9464 0.9006 0.8342 0.74616 1.0000 1.0000 1.0000 0.9997 0.9987 0.9957 0.9888 0.9750 0.9502 0.91027 1.0000 1.0000 1.0000 1.0000 0.9999 0.9996 0.9986 0.9962 0.9909 0.98058 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9997 0.9992 0.99809 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

10 0 0.5987 0.3487 0.1969 0.1074 0.0563 0.0282 0.0135 0.0060 0.0025 0.00101 0.9139 0.7361 0.5443 0.3758 0.2440 0.1493 0.0860 0.0464 0.0233 0.01072 0.9885 0.9298 0.8202 0.6778 0.5256 0.3828 0.2616 0.1673 0.0996 0.05473 0.9990 0.9872 0.9500 0.8791 0.7759 0.6496 0.5138 0.3823 0.2660 0.17194 0.9999 0.9984 0.9901 0.9672 0.9219 0.8497 0.7515 0.6331 0.5044 0.37705 1.0000 0.9999 0.9986 0.9936 0.9803 0.9527 0.9051 0.8338 0.7384 0.62306 1.0000 1.0000 0.9999 0.9991 0.9965 0.9894 0.9740 0.9452 0.8980 0.82817 1.0000 1.0000 1.0000 0.9999 0.9996 0.9984 0.9952 0.9877 0.9726 0.94538 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9995 0.9983 0.9955 0.98939 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9997 0.9990

10 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

11 0 0.5688 0.3138 0.1673 0.0859 0.0422 0.0198 0.0088 0.0036 0.0014 0.00051 0.8981 0.6974 0.4922 0.3221 0.1971 0.1130 0.0606 0.0302 0.0139 0.00592 0.9848 0.9104 0.7788 0.6174 0.4552 0.3127 0.2001 0.1189 0.0652 0.03273 0.9984 0.9815 0.9306 0.8389 0.7133 0.5696 0.4256 0.2963 0.1911 0.11334 0.9999 0.9972 0.9841 0.9496 0.8854 0.7897 0.6683 0.5328 0.3971 0.27445 1.0000 0.9997 0.9973 0.9883 0.9657 0.9218 0.8513 0.7535 0.6331 0.50006 1.0000 1.0000 0.9997 0.9980 0.9924 0.9784 0.9499 0.9006 0.8262 0.7256


Table II continued

p

n x 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50

7 1.0000 1.0000 1.0000 0.9998 0.9988 0.9957 0.9878 0.9707 0.9390 0.88678 1.0000 1.0000 1.0000 1.0000 0.9999 0.9994 0.9980 0.9941 0.9852 0.96739 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9998 0.9993 0.9978 0.9941

10 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9998 0.999511 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

12 0 0.5404 0.2824 0.1422 0.0687 0.0317 0.0138 0.0057 0.0022 0.0008 0.00021 0.8816 0.6590 0.4435 0.2749 0.1584 0.0850 0.0424 0.0196 0.0083 0.00322 0.9804 0.8891 0.7358 0.5583 0.3907 0.2528 0.1513 0.0834 0.0421 0.01933 0.9978 0.9744 0.9078 0.7946 0.6488 0.4925 0.3467 0.2253 0.1345 0.07304 0.9998 0.9957 0.9761 0.9274 0.8424 0.7237 0.5833 0.4382 0.3044 0.19385 1.0000 0.9995 0.9954 0.9806 0.9456 0.8822 0.7873 0.6652 0.5269 0.38726 1.0000 0.9999 0.9993 0.9961 0.9857 0.9614 0.9154 0.8418 0.7393 0.61287 1.0000 1.0000 0.9999 0.9994 0.9972 0.9905 0.9745 0.9427 0.8883 0.80628 1.0000 1.0000 1.0000 0.9999 0.9996 0.9983 0.9944 0.9847 0.9644 0.92709 1.0000 1.0000 1.0000 1.0000 1.0000 0.9998 0.9992 0.9972 0.9921 0.9807

10 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9997 0.9989 0.996811 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.999812 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

13 0 0.5133 0.2542 0.1209 0.0550 0.0238 0.0097 0.0037 0.0013 0.0004 0.00011 0.8646 0.6213 0.3983 0.2336 0.1267 0.0637 0.0296 0.0126 0.0049 0.00172 0.9755 0.8661 0.6920 0.5017 0.3326 0.2025 0.1132 0.0579 0.0269 0.01123 0.9969 0.9658 0.8820 0.7473 0.5843 0.4206 0.2783 0.1686 0.0929 0.04614 0.9997 0.9935 0.9658 0.9009 0.7940 0.6543 0.5005 0.3530 0.2279 0.13345 1.0000 0.9991 0.9924 0.9700 0.9198 0.8346 0.7159 0.5744 0.4268 0.29056 1.0000 0.9999 0.9987 0.9930 0.9757 0.9376 0.8705 0.7712 0.6437 0.50007 1.0000 1.0000 0.9998 0.9988 0.9944 0.9818 0,9538 0.9023 0.8212 0.70958 1.0000 1.0000 1.0000 0.9998 0.9990 0.9960 0.9874 0.9679 0.9302 0.86669 1.0000 1.0000 1.0000 1.0000 0.9999 0.9993 0.9975 0.9922 0.9797 0.9539

10 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9997 0.9987 0.9959 0.988811 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9995 0.998312 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.999913 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

14 0 0.4877 0.2288 0.1028 0.0440 0.0178 0.0068 0.0024 0.0008 0.0002 0.00011 0.8470 0.5846 0.3567 0.1979 0.1010 0.0475 0.0205 0.0081 0.0029 0.00092 0.9699 0.8416 0.6479 0.4481 0.2811 0.1608 0.0839 0.0398 0.0170 0.00653 0.9958 0.9559 0.8535 0.6982 0.5213 0.3552 0.2205 0.1243 0.0632 0.02874 0.9996 0.9908 0.9533 0.8702 0.7415 0.5842 0.4227 0.2793 0.1672 0.08985 1.0000 0.9985 0.9885 0.9561 0.8883 0.7805 0.6405 0.4859 0.3373 0.21206 1.0000 0.9998 0.9978 0.9884 0.9617 0.9067 0.8164 0.6925 0.5461 0.39537 1.0000 1.0000 0.9997 0.9976 0.9897 0.9685 0.9247 0.8499 0.7414 0.60478 1.0000 1.0000 1.0000 0.9996 0.9978 0.9917 0.9757 0.9417 0.8811 0.78809 1.0000 1.0000 1.0000 1.0000 0.9997 0.9983 0.9940 0.9825 0.9574 0.9102

10 1.0000 1.0000 1.0000 1.0000 1.0000 0.9998 0.9989 0.9961 0.9886 0.9713


Table II continued

p

n x 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50

11 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9994 0.9978 0.993512 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9997 0.999113 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.999914 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

15 0 0.4633 0.2059 0.0874 0.0352 0.0134 0.0047 0.0016 0.0005 0.0001 0.00001 0.8290 0.5490 0.3186 0.1671 0.0802 0.0353 0.0142 0.0052 0.0017 0.00052 0.9638 0.8159 0.6042 0.3980 0.2361 0.1268 0.0617 0.0271 0.0107 0.00373 0.9945 0.9444 0.8227 0.6482 0.4613 0.2969 0.1727 0.0905 0.0424 0.01764 0.9994 0.9873 0.9383 0.8358 0.6865 0.5155 0.3519 0.2173 0.1204 0.05925 0.9999 0.9978 0.9832 0.9389 0.8516 0.7216 0.5643 0.4032 0.2608 0.15096 1.0000 0.9997 0.9964 0.9819 0.9434 0.8689 0.7548 0.6098 0.4522 0.30367 1.0000 1.0000 0.9994 0.9958 0.9827 0.9500 0.8868 0.7869 0.6535 0.50008 1.0000 1.0000 0.9999 0.9992 0.9958 0.9848 0.9578 0.9050 0.8182 0.69649 1.0000 1.0000 1.0000 0.9999 0.9992 0.9963 0.9876 0.9662 0.9231 0.8491

10 1.0000 1.0000 1.0000 1.0000 0.9999 0.9993 0.9972 0.9907 0.9745 0.940811 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9995 0.9981 0.9937 0.982412 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9987 0.9989 0.996313 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.999514 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.000015 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

16 0 0.4401 0.1853 0.0743 0.0281 0.0100 0.0033 0.0010 0.0003 0.0001 0.00001 0.8108 0.5147 0.2839 0.1407 0.0635 0.0261 0.0098 0.0033 0.0010 0.00032 0.9571 0.7892 0.5614 0.3518 0.1971 0.0994 0.0451 0.0183 0.0066 0.00213 0.9930 0.9316 0.7899 0.5981 0.4050 0.2459 0.1339 0,0651 0.0281 0.01064 0.9991 0.9830 0.9209 0.7982 0.6302 0.4499 0.2892 0.1666 0.0853 0.03845 0.9999 0.9967 0.9765 0.9183 0.8103 0.6598 0.4900 0.3288 0.1976 0.10516 1.0000 0.9995 0.9944 0.9733 0.9204 0.8247 0.6881 0.5272 0.3660 0.22727 1.0000 0.9999 0.9989 0.9930 0.9729 0.9256 0.8406 0.7161 0.5629 0.40188 1.0000 1.0000 0.9998 0.9985 0.9925 0.9743 0.9329 0.8577 0.7441 0.59829 1.0000 1.0000 1.0000 0.9998 0.9984 0.9929 0.9771 0.9417 0.8759 0.7728

10 1.0000 1.0000 1.0000 1.0000 0.9997 0.9984 0.9938 0.9809 0.9514 0.894911 1.0000 1.0000 1.0000 1.0000 1.0000 0.9997 0.9987 0.9951 0.9851 0.961612 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9998 0.9991 0.9965 0.989413 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9994 0.997914 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.999715 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.000016 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

20 0 0.3585 0.1216 0.0388 0.0115 0.0032 0.0008 0.0002 0.0000 0.0000 0.00001 0.7358 0.3917 0.1756 0.0692 0.0243 0.0076 0.0021 0.0005 0.0001 0.00002 0.9245 0.6769 0.4049 0.2061 0.0913 0.0355 0.0121 0.0036 0.0009 0.00023 0.9841 0.8670 0.6477 0.4114 0.2252 0.1071 0.0444 0.0160 0.0049 0.00134 0.9974 0.9568 0.8298 0.6296 0.4148 0.2375 0.1182 0.0510 0.0189 0.0059


Table II continued

p

n x 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50

5 0.9997 0.9887 0.9327 0.8042 0.6172 0.4164 0.2454 0.1256 0.0553 0.02076 1.0000 0.9976 0.9781 0.9133 0.7858 0.6080 0.4166 0.2500 0.1299 0.05777 1.0000 0.9996 0.9941 0.9679 0.8982 0.7723 0.6010 0.4159 0.2520 0.13168 1.0000 0.9999 0.9987 0.9900 0.9591 0.8867 0.7624 0.5956 0.4143 0.25179 1.0000 1.0000 0.9998 0.9974 0.9861 0.9520 0.8782 0.7553 0.5914 0.4119

10 1.0000 1.0000 1.0000 0.9994 0.9961 0.9829 0.9468 0.8725 0.7507 0.588111 1.0000 1.0000 1.0000 0.9999 0.9991 0.9949 0.9804 0.9435 0.8692 0.748312 1.0000 1.0000 1.0000 1.0000 0.9998 0.9987 0.9940 0.9790 0.9420 0.868413 1.0000 1.0000 1.0000 1.0000 1.0000 0.9997 0.9985 0.9935 0.9786 0.942314 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9997 0.9984 0.9936 0.979315 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9997 0.9985 0.994116 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9997 0.998717 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.999818 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.000019 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.000020 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

25 0 0.2774 0.0718 0.0172 0.0038 0.0008 0.0001 0.0000 0.0000 0.0000 0.00001 0.6424 0.2712 0.0931 0.0274 0.0070 0.0016 0.0003 0.0001 0.0000 0.00002 0.8729 0.5371 0.2537 0.0982 0.0321 0.0090 0.0021 0.0004 0.0001 0.00003 0.9659 0.7636 0.4711 0.2340 0.0962 0.0332 0.0097 0.0024 0.0005 0.00014 0.9928 0.9020 0.6821 0.4207 0.2137 0.0905 0.0320 0.0095 0.0023 0.00055 0.9988 0.9666 0.8385 0.6167 0.3783 0.1935 0.0826 0.0294 0.0086 0.00206 0.9998 0.9905 0.9305 0.7800 0.5611 0.3407 0.1734 0.0736 0.0258 0.00737 1.0000 0.9977 0.9745 0.8909 0.7265 0.5118 0.3061 0.1536 0.0639 0.02168 1.0000 0.9995 0.9920 0.9532 0.8506 0.6769 0.4668 0.2735 0.1340 0.05399 1.0000 0.9999 0.9979 0.9827 0.9287 0.8106 0.6303 0.4246 0.2424 0.1148

10 1.0000 1.0000 0.9995 0.9944 0.9703 0.9022 0.7712 0.5858 0.3843 0.212211 1.0000 1.0000 0.9999 0.9985 0.9893 0.9558 0.8746 0.7323 0.5426 0.345012 1.0000 1.0000 1.0000 0.9996 0.9966 0.9825 0.9396 0.8462 0.6937 0.500013 1.0000 1.0000 1.0000 0.9999 0.9991 0.9940 0.9745 0.9222 0.8173 0.655014 1.0000 1.0000 1,0000 1.0000 0.9998 0.9982 0.9907 0.9656 0.9040 0.787815 1.0000 1.0000 1.0000 1.0000 1.0000 0.9995 0.9971 0.9868 0.9560 0.885216 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9992 0.9957 0.9826 0.946117 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9998 0.9988 0.9942 0.978418 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9997 0.9984 0.992719 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.9996 0.998020 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9999 0.999521 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.999922 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.000023 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.000024 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.000025 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000


Table III The Poisson Distribution

F(x) = P(X ≤ x) = Σ_{k=0}^{x} λ^k e^(−λ)/k!,   λ = E(X)

[The accompanying graphs show the pmf f(x) and the cdf F(x) of the Poisson distribution with λ = 3.8.]

x 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

0 0.905 0.819 0.741 0.670 0.607 0.549 0.497 0.449 0.407 0.3681 0.995 0.982 0.963 0.938 0.910 0.878 0.844 0.809 0.772 0.7362 1.000 0.999 0.996 0.992 0.986 0.977 0.966 0.953 0.937 0.9203 1.000 1.000 1.000 0.999 0.998 0.997 0.994 0.991 0.987 0.9814 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.999 0.998 0.996

5 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.9996 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000

x 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0

0 0.333 0.301 0.273 0.247 0.223 0.202 0.183 0.165 0.150 0.1351 0.699 0.663 0.627 0.592 0.558 0.525 0.493 0.463 0.434 0.4062 0.900 0.879 0.857 0.833 0.809 0.783 0.757 0.731 0.704 0.6773 0.974 0.966 0.957 0.946 0.934 0.921 0.907 0.891 0.875 0.8574 0.995 0.992 0.989 0.986 0.981 0.976 0.970 0.964 0.956 0.947

5 0.999 0.998 0.998 0.997 0.996 0.994 0.992 0.990 0.987 0.9836 1.000 1.000 1.000 0.999 0.999 0.999 0.998 0.997 0.997 0.9957 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.999 0.9998 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000

x 2.2 2.4 2.6 2.8 3.0 3.2 3.4 3.6 3.8 4.0

0 0.111 0.091 0.074 0.061 0.050 0.041 0.033 0.027 0.022 0.0181 0.355 0.308 0.267 0.231 0.199 0.171 0.147 0.126 0.107 0.0922 0.623 0.570 0.518 0.469 0.423 0.380 0.340 0.303 0.269 0.2383 0.819 0.779 0.736 0.692 0.647 0.603 0.558 0.515 0.473 0.4334 0.928 0.904 0.877 0.848 0.815 0.781 0.744 0.706 0.668 0.629

5 0.975 0.964 0.951 0.935 0.916 0.895 0.871 0.844 0.816 0.7856 0.993 0.988 0.983 0.976 0.966 0.955 0.942 0.927 0.909 0.8897 0.998 0.997 0.995 0.992 0.988 0.983 0.977 0.969 0.960 0.9498 1.000 0.999 0.999 0.998 0.996 0.994 0.992 0.988 0.984 0.9799 1.000 1.000 1.000 0.999 0.999 0.998 0.997 0.996 0.994 0.992

10 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.999 0.998 0.99711 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.99912 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000



Table III continued

x 4.2 4.4 4.6 4.8 5.0 5.2 5.4 5.6 5.8 6.0

0 0.015 0.012 0.010 0.008 0.007 0.006 0.005 0.004 0.003 0.0021 0.078 0.066 0.056 0.048 0.040 0.034 0.029 0.024 0.021 0.0172 0.210 0.185 0.163 0.143 0.125 0.109 0.095 0.082 0.072 0.0623 0.395 0.359 0.326 0.294 0.265 0.238 0.213 0.191 0.170 0.1514 0.590 0.551 0.513 0.476 0.440 0.406 0.373 0.342 0.313 0.285

5 0.753 0.720 0.686 0.651 0.616 0.581 0.546 0.512 0.478 0.4466 0.867 0.844 0.818 0.791 0.762 0.732 0.702 0.670 0.638 0.6067 0.936 0.921 0.905 0.887 0.867 0.845 0.822 0.797 0.771 0.7448 0.972 0.964 0.955 0.944 0.932 0.918 0.903 0.886 0.867 0.8479 0.989 0.985 0.980 0.975 0.968 0.960 0.951 0.941 0.929 0.916

10 0.996 0.994 0.992 0.990 0.986 0.982 0.977 0.972 0.965 0.95711 0.999 0.998 0.997 0.996 0.995 0.993 0.990 0.988 0.984 0.98012 1.000 0.999 0.999 0.999 0.998 0.997 0.996 0.995 0.993 0.99113 1.000 1.000 1.000 1.000 0.999 0.999 0.999 0.998 0.997 0.99614 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.999 0.999 0.999

15 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.99916 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000

x 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0 10.5 11.0

0 0.002 0.001 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.0001 0.011 0.007 0.005 0.003 0.002 0.001 0.001 0.000 0.000 0.0002 0.043 0.030 0.020 0.014 0.009 0.006 0.004 0.003 0.002 0.0013 0.112 0.082 0.059 0.042 0.030 0.021 0.015 0.010 0.007 0.0054 0.224 0.173 0.132 0.100 0.074 0.055 0.040 0.029 0.021 0.015

5 0.369 0.301 0.241 0.191 0.150 0.116 0.089 0.067 0.050 0.0386 0.527 0.450 0.378 0.313 0.256 0.207 0.165 0.130 0.102 0.0797 0.673 0.599 0.525 0.453 0.386 0.324 0.269 0.220 0.179 0.1438 0.792 0.729 0.662 0.593 0.523 0.456 0.392 0.333 0.279 0.2329 0.877 0.830 0.776 0.717 0.653 0.587 0.522 0.458 0.397 0.341

10 0.933 0.901 0.862 0.816 0.763 0.706 0.645 0.583 0.521 0.46011 0.966 0.947 0.921 0.888 0.849 0.803 0.752 0.697 0.639 0.57912 0.984 0.973 0.957 0.936 0.909 0.876 0.836 0.792 0.742 0.68913 0.993 0.987 0.978 0.966 0.949 0.926 0.898 0.864 0.825 0.78114 0.997 0.994 0.990 0.983 0.973 0.959 0.940 0.917 0.888 0.854

15 0.999 0.998 0.995 0.992 0.986 0.978 0.967 0.951 0.932 0.90716 1.000 0.999 0.998 0.996 0.993 0.989 0.982 0.973 0.960 0.94417 1.000 1.000 0.999 0.998 0.997 0.995 0.991 0.986 0.978 0.96818 1.000 1.000 1.000 0.999 0.999 0.998 0.096 0.993 0.988 0.98219 1.000 1.000 1.000 1.000 0.999 0.999 0.998 0.997 0.994 0.991

20 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.998 0.997 0.99521 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.999 0.99822 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.99923 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000


Table III continued

x 11.5 12.0 12.5 13.0 13.5 14.0 14.5 15.0 15.5 16.0

 0   0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
 1   0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
 2   0.001  0.001  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
 3   0.003  0.002  0.002  0.001  0.001  0.000  0.000  0.000  0.000  0.000
 4   0.011  0.008  0.005  0.004  0.003  0.002  0.001  0.001  0.001  0.000

 5   0.028  0.020  0.015  0.011  0.008  0.006  0.004  0.003  0.002  0.001
 6   0.060  0.046  0.035  0.026  0.019  0.014  0.010  0.008  0.006  0.004
 7   0.114  0.090  0.070  0.054  0.041  0.032  0.024  0.018  0.013  0.010
 8   0.191  0.155  0.125  0.100  0.079  0.062  0.048  0.037  0.029  0.022
 9   0.289  0.242  0.201  0.166  0.135  0.109  0.088  0.070  0.055  0.043

10   0.402  0.347  0.297  0.252  0.211  0.176  0.145  0.118  0.096  0.077
11   0.520  0.462  0.406  0.353  0.304  0.260  0.220  0.185  0.154  0.127
12   0.633  0.576  0.519  0.463  0.409  0.358  0.311  0.268  0.228  0.193
13   0.733  0.682  0.629  0.573  0.518  0.464  0.413  0.363  0.317  0.275
14   0.815  0.772  0.725  0.675  0.623  0.570  0.518  0.466  0.415  0.368

15   0.878  0.844  0.806  0.764  0.718  0.669  0.619  0.568  0.517  0.467
16   0.924  0.899  0.869  0.835  0.798  0.756  0.711  0.664  0.615  0.566
17   0.954  0.937  0.916  0.890  0.861  0.827  0.790  0.749  0.705  0.659
18   0.974  0.963  0.948  0.930  0.908  0.883  0.853  0.819  0.782  0.742
19   0.986  0.979  0.969  0.957  0.942  0.923  0.901  0.875  0.846  0.812

20   0.992  0.988  0.983  0.975  0.965  0.952  0.936  0.917  0.894  0.868
21   0.996  0.994  0.991  0.986  0.980  0.971  0.960  0.947  0.930  0.911
22   0.999  0.997  0.995  0.992  0.989  0.983  0.976  0.967  0.956  0.942
23   0.999  0.999  0.998  0.996  0.994  0.991  0.986  0.981  0.973  0.963
24   1.000  0.999  0.999  0.998  0.997  0.995  0.992  0.989  0.984  0.978

25   1.000  1.000  0.999  0.999  0.998  0.997  0.996  0.994  0.991  0.987
26   1.000  1.000  1.000  1.000  0.999  0.999  0.998  0.997  0.995  0.993
27   1.000  1.000  1.000  1.000  1.000  0.999  0.999  0.998  0.997  0.996
28   1.000  1.000  1.000  1.000  1.000  1.000  0.999  0.999  0.999  0.998
29   1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  0.999  0.999

30   1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  0.999
31   1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
32   1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
33   1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
34   1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000

35   1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
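The entries above are values of the Poisson distribution function P(X ≤ x) = Σ_{k=0}^{x} λ^k e^{−λ}/k!, with the column headings giving λ. As an illustration only (not part of the original table), a minimal Python sketch, assuming SciPy is available, reproduces an entry such as P(X ≤ 5) = 0.785 when λ = 4.0:

# Minimal sketch (assumes SciPy); reproduces a Table III entry.
from scipy.stats import poisson

lam = 4.0                                # Poisson mean (column heading)
print(f"{poisson.cdf(5, lam):.3f}")      # 0.785, matching the x = 5, lambda = 4.0 entry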


Table IV The Chi-Square Distribution

[Figure: p.d.f. of the χ²(8) distribution, illustrating the lower-tail probability P(X ≤ x) and the upper critical value χ²α(8) with right-tail area α.]

P(X ≤ x) = \int_0^x \frac{1}{\Gamma(r/2)\, 2^{r/2}}\, w^{r/2-1} e^{-w/2}\, dw

P(X ≤ x)     0.010        0.025        0.050        0.100        0.900        0.950        0.975        0.990
r            χ²_0.99(r)   χ²_0.975(r)  χ²_0.95(r)   χ²_0.90(r)   χ²_0.10(r)   χ²_0.05(r)   χ²_0.025(r)  χ²_0.01(r)

 1   0.000   0.001   0.004   0.016   2.706   3.841   5.024   6.635
 2   0.020   0.051   0.103   0.211   4.605   5.991   7.378   9.210
 3   0.115   0.216   0.352   0.584   6.251   7.815   9.348   11.34
 4   0.297   0.484   0.711   1.064   7.779   9.488   11.14   13.28
 5   0.554   0.831   1.145   1.610   9.236   11.07   12.83   15.09

 6   0.872   1.237   1.635   2.204   10.64   12.59   14.45   16.81
 7   1.239   1.690   2.167   2.833   12.02   14.07   16.01   18.48
 8   1.646   2.180   2.733   3.490   13.36   15.51   17.54   20.09
 9   2.088   2.700   3.325   4.168   14.68   16.92   19.02   21.67
10   2.558   3.247   3.940   4.865   15.99   18.31   20.48   23.21

11   3.053   3.816   4.575   5.578   17.28   19.68   21.92   24.72
12   3.571   4.404   5.226   6.304   18.55   21.03   23.34   26.22
13   4.107   5.009   5.892   7.042   19.81   22.36   24.74   27.69
14   4.660   5.629   6.571   7.790   21.06   23.68   26.12   29.14
15   5.229   6.262   7.261   8.547   22.31   25.00   27.49   30.58

16   5.812   6.908   7.962   9.312   23.54   26.30   28.84   32.00
17   6.408   7.564   8.672   10.08   24.77   27.59   30.19   33.41
18   7.015   8.231   9.390   10.86   25.99   28.87   31.53   34.80
19   7.633   8.907   10.12   11.65   27.20   30.14   32.85   36.19
20   8.260   9.591   10.85   12.44   28.41   31.41   34.17   37.57

21   8.897   10.28   11.59   13.24   29.62   32.67   35.48   38.93
22   9.542   10.98   12.34   14.04   30.81   33.92   36.78   40.29
23   10.20   11.69   13.09   14.85   32.01   35.17   38.08   41.64
24   10.86   12.40   13.85   15.66   33.20   36.42   39.36   42.98
25   11.52   13.12   14.61   16.47   34.38   37.65   40.65   44.31

26   12.20   13.84   15.38   17.29   35.56   38.88   41.92   45.64
27   12.88   14.57   16.15   18.11   36.74   40.11   43.19   46.96
28   13.56   15.31   16.93   18.94   37.92   41.34   44.46   48.28
29   14.26   16.05   17.71   19.77   39.09   42.56   45.72   49.59
30   14.95   16.79   18.49   20.60   40.26   43.77   46.98   50.89

40   22.16   24.43   26.51   29.05   51.80   55.76   59.34   63.69
50   29.71   32.36   34.76   37.69   63.17   67.50   71.42   76.15
60   37.48   40.48   43.19   46.46   74.40   79.08   83.30   88.38
70   45.44   48.76   51.74   55.33   85.53   90.53   95.02   100.4
80   53.34   57.15   60.39   64.28   96.58   101.9   106.6   112.3

This table is abridged and adapted from Table III in Biometrika Tables for Statisticians, edited by E. S. Pearson and H. O. Hartley.
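For readers who want to recover an entry by software rather than interpolation, the following minimal sketch (it assumes SciPy, which is not part of the text) returns quantiles tabulated above, e.g., χ²_0.05(8) = 15.51 and χ²_0.95(8) = 2.733:

# Minimal sketch (assumes SciPy): chi2.ppf is the quantile (inverse CDF) function.
from scipy.stats import chi2

r = 8
print(f"{chi2.ppf(0.95, r):.2f}")   # 15.51, the upper 5% point chi-square_0.05(8)
print(f"{chi2.ppf(0.05, r):.3f}")   # 2.733, the lower 5% point chi-square_0.95(8)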


Table Va The Standard Normal Distribution Function

[Figure: standard normal p.d.f. f(z); the shaded area to the left of z0 equals Φ(z0).]

P(Z ≤ z) = Φ(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-w^2/2}\, dw

Φ(−z) = 1 − Φ(z)

z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09

0.0  0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1  0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2  0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
0.3  0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
0.4  0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
0.5  0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
0.6  0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
0.7  0.7580  0.7611  0.7642  0.7673  0.7703  0.7734  0.7764  0.7794  0.7823  0.7852
0.8  0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
0.9  0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
1.0  0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
1.1  0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
1.2  0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
1.3  0.9032  0.9049  0.9066  0.9082  0.9099  0.9115  0.9131  0.9147  0.9162  0.9177
1.4  0.9192  0.9207  0.9222  0.9236  0.9251  0.9265  0.9279  0.9292  0.9306  0.9319
1.5  0.9332  0.9345  0.9357  0.9370  0.9382  0.9394  0.9406  0.9418  0.9429  0.9441
1.6  0.9452  0.9463  0.9474  0.9484  0.9495  0.9505  0.9515  0.9525  0.9535  0.9545
1.7  0.9554  0.9564  0.9573  0.9582  0.9591  0.9599  0.9608  0.9616  0.9625  0.9633
1.8  0.9641  0.9649  0.9656  0.9664  0.9671  0.9678  0.9686  0.9693  0.9699  0.9706
1.9  0.9713  0.9719  0.9726  0.9732  0.9738  0.9744  0.9750  0.9756  0.9761  0.9767
2.0  0.9772  0.9778  0.9783  0.9788  0.9793  0.9798  0.9803  0.9808  0.9812  0.9817
2.1  0.9821  0.9826  0.9830  0.9834  0.9838  0.9842  0.9846  0.9850  0.9854  0.9857
2.2  0.9861  0.9864  0.9868  0.9871  0.9875  0.9878  0.9881  0.9884  0.9887  0.9890
2.3  0.9893  0.9896  0.9898  0.9901  0.9904  0.9906  0.9909  0.9911  0.9913  0.9916
2.4  0.9918  0.9920  0.9922  0.9925  0.9927  0.9929  0.9931  0.9932  0.9934  0.9936
2.5  0.9938  0.9940  0.9941  0.9943  0.9945  0.9946  0.9948  0.9949  0.9951  0.9952
2.6  0.9953  0.9955  0.9956  0.9957  0.9959  0.9960  0.9961  0.9962  0.9963  0.9964
2.7  0.9965  0.9966  0.9967  0.9968  0.9969  0.9970  0.9971  0.9972  0.9973  0.9974
2.8  0.9974  0.9975  0.9976  0.9977  0.9977  0.9978  0.9979  0.9979  0.9980  0.9981
2.9  0.9981  0.9982  0.9982  0.9983  0.9984  0.9984  0.9985  0.9985  0.9986  0.9986
3.0  0.9987  0.9987  0.9987  0.9988  0.9988  0.9989  0.9989  0.9989  0.9990  0.9990

α 0.400 0.300 0.200 0.100 0.050 0.025 0.020 0.010 0.005 0.001

zα     0.253  0.524  0.842  1.282  1.645  1.960  2.054  2.326  2.576  3.090
zα/2   0.842  1.036  1.282  1.645  1.960  2.240  2.326  2.576  2.807  3.291
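The same values can be generated numerically; the sketch below (assuming SciPy, which the text itself does not use) evaluates Φ(z) and the upper percentage points zα listed in the last two rows:

# Minimal sketch (assumes SciPy): norm.cdf gives Phi(z), norm.ppf its inverse.
from scipy.stats import norm

print(f"{norm.cdf(1.96):.4f}")    # 0.9750, matching the z = 1.96 entry
print(f"{norm.ppf(0.975):.3f}")   # 1.960 = z_0.025, matching the z_alpha row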


Table Vb The Standard Normal Right-Tail Probabilities

[Figure: standard normal p.d.f. f(z); the shaded right-tail area to the right of zα equals α, that is, P(Z > zα) = α.]

P(Z > z) = 1 − Φ(z) = Φ(−z)

zα 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09

0.0  0.5000  0.4960  0.4920  0.4880  0.4840  0.4801  0.4761  0.4721  0.4681  0.4641
0.1  0.4602  0.4562  0.4522  0.4483  0.4443  0.4404  0.4364  0.4325  0.4286  0.4247
0.2  0.4207  0.4168  0.4129  0.4090  0.4052  0.4013  0.3974  0.3936  0.3897  0.3859
0.3  0.3821  0.3783  0.3745  0.3707  0.3669  0.3632  0.3594  0.3557  0.3520  0.3483
0.4  0.3446  0.3409  0.3372  0.3336  0.3300  0.3264  0.3228  0.3192  0.3156  0.3121
0.5  0.3085  0.3050  0.3015  0.2981  0.2946  0.2912  0.2877  0.2843  0.2810  0.2776
0.6  0.2743  0.2709  0.2676  0.2643  0.2611  0.2578  0.2546  0.2514  0.2483  0.2451
0.7  0.2420  0.2389  0.2358  0.2327  0.2296  0.2266  0.2236  0.2206  0.2177  0.2148
0.8  0.2119  0.2090  0.2061  0.2033  0.2005  0.1977  0.1949  0.1922  0.1894  0.1867
0.9  0.1841  0.1814  0.1788  0.1762  0.1736  0.1711  0.1685  0.1660  0.1635  0.1611
1.0  0.1587  0.1562  0.1539  0.1515  0.1492  0.1469  0.1446  0.1423  0.1401  0.1379
1.1  0.1357  0.1335  0.1314  0.1292  0.1271  0.1251  0.1230  0.1210  0.1190  0.1170
1.2  0.1151  0.1131  0.1112  0.1093  0.1075  0.1056  0.1038  0.1020  0.1003  0.0985
1.3  0.0968  0.0951  0.0934  0.0918  0.0901  0.0885  0.0869  0.0853  0.0838  0.0823
1.4  0.0808  0.0793  0.0778  0.0764  0.0749  0.0735  0.0721  0.0708  0.0694  0.0681
1.5  0.0668  0.0655  0.0643  0.0630  0.0618  0.0606  0.0594  0.0582  0.0571  0.0559
1.6  0.0548  0.0537  0.0526  0.0516  0.0505  0.0495  0.0485  0.0475  0.0465  0.0455
1.7  0.0446  0.0436  0.0427  0.0418  0.0409  0.0401  0.0392  0.0384  0.0375  0.0367
1.8  0.0359  0.0351  0.0344  0.0336  0.0329  0.0322  0.0314  0.0307  0.0301  0.0294
1.9  0.0287  0.0281  0.0274  0.0268  0.0262  0.0256  0.0250  0.0244  0.0239  0.0233
2.0  0.0228  0.0222  0.0217  0.0212  0.0207  0.0202  0.0197  0.0192  0.0188  0.0183
2.1  0.0179  0.0174  0.0170  0.0166  0.0162  0.0158  0.0154  0.0150  0.0146  0.0143
2.2  0.0139  0.0136  0.0132  0.0129  0.0125  0.0122  0.0119  0.0116  0.0113  0.0110
2.3  0.0107  0.0104  0.0102  0.0099  0.0096  0.0094  0.0091  0.0089  0.0087  0.0084
2.4  0.0082  0.0080  0.0078  0.0075  0.0073  0.0071  0.0069  0.0068  0.0066  0.0064
2.5  0.0062  0.0060  0.0059  0.0057  0.0055  0.0054  0.0052  0.0051  0.0049  0.0048
2.6  0.0047  0.0045  0.0044  0.0043  0.0041  0.0040  0.0039  0.0038  0.0037  0.0036
2.7  0.0035  0.0034  0.0033  0.0032  0.0031  0.0030  0.0029  0.0028  0.0027  0.0026
2.8  0.0026  0.0025  0.0024  0.0023  0.0023  0.0022  0.0021  0.0021  0.0020  0.0019
2.9  0.0019  0.0018  0.0018  0.0017  0.0016  0.0016  0.0015  0.0015  0.0014  0.0014
3.0  0.0013  0.0013  0.0013  0.0012  0.0012  0.0011  0.0011  0.0011  0.0010  0.0010
3.1  0.0010  0.0009  0.0009  0.0009  0.0008  0.0008  0.0008  0.0008  0.0007  0.0007
3.2  0.0007  0.0007  0.0006  0.0006  0.0006  0.0006  0.0006  0.0005  0.0005  0.0005
3.3  0.0005  0.0005  0.0005  0.0004  0.0004  0.0004  0.0004  0.0004  0.0004  0.0003
3.4  0.0003  0.0003  0.0003  0.0003  0.0003  0.0003  0.0003  0.0003  0.0003  0.0002


Table VI The t Distribution

[Figure: t p.d.f., illustrating the lower-tail probability P(T ≤ t) and the upper critical value tα(r) with right-tail area α.]

P(T ≤ t) = \int_{-\infty}^{t} \frac{\Gamma[(r+1)/2]}{\sqrt{\pi r}\, \Gamma(r/2)\, (1 + w^2/r)^{(r+1)/2}}\, dw

P(T ≤ −t) = 1 − P(T ≤ t)

P(T ≤ t)

0.60 0.75 0.90 0.95 0.975 0.99 0.995

r t0.40(r) t0.25(r) t0.10(r) t0.05(r) t0.025(r) t0.01(r) t0.005(r)

 1   0.325   1.000   3.078   6.314   12.706   31.821   63.657
 2   0.289   0.816   1.886   2.920    4.303    6.965    9.925
 3   0.277   0.765   1.638   2.353    3.182    4.541    5.841
 4   0.271   0.741   1.533   2.132    2.776    3.747    4.604
 5   0.267   0.727   1.476   2.015    2.571    3.365    4.032

 6   0.265   0.718   1.440   1.943    2.447    3.143    3.707
 7   0.263   0.711   1.415   1.895    2.365    2.998    3.499
 8   0.262   0.706   1.397   1.860    2.306    2.896    3.355
 9   0.261   0.703   1.383   1.833    2.262    2.821    3.250
10   0.260   0.700   1.372   1.812    2.228    2.764    3.169

11   0.260   0.697   1.363   1.796    2.201    2.718    3.106
12   0.259   0.695   1.356   1.782    2.179    2.681    3.055
13   0.259   0.694   1.350   1.771    2.160    2.650    3.012
14   0.258   0.692   1.345   1.761    2.145    2.624    2.977
15   0.258   0.691   1.341   1.753    2.131    2.602    2.947

16   0.258   0.690   1.337   1.746    2.120    2.583    2.921
17   0.257   0.689   1.333   1.740    2.110    2.567    2.898
18   0.257   0.688   1.330   1.734    2.101    2.552    2.878
19   0.257   0.688   1.328   1.729    2.093    2.539    2.861
20   0.257   0.687   1.325   1.725    2.086    2.528    2.845

21   0.257   0.686   1.323   1.721    2.080    2.518    2.831
22   0.256   0.686   1.321   1.717    2.074    2.508    2.819
23   0.256   0.685   1.319   1.714    2.069    2.500    2.807
24   0.256   0.685   1.318   1.711    2.064    2.492    2.797
25   0.256   0.684   1.316   1.708    2.060    2.485    2.787

26   0.256   0.684   1.315   1.706    2.056    2.479    2.779
27   0.256   0.684   1.314   1.703    2.052    2.473    2.771
28   0.256   0.683   1.313   1.701    2.048    2.467    2.763
29   0.256   0.683   1.311   1.699    2.045    2.462    2.756
30   0.256   0.683   1.310   1.697    2.042    2.457    2.750

 ∞   0.253   0.674   1.282   1.645    1.960    2.326    2.576

This table is taken from Table III of Fisher and Yates: Statistical Tables for Biological, Agricultural, and Medical Research, published by Longman Group Ltd., London (previously published by Oliver and Boyd, Edinburgh).
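As a hedged illustration (SciPy is assumed; it is not part of the original table), the tabulated critical values can be reproduced with the t quantile function, and the last row (r = ∞) is recovered from the standard normal distribution:

# Minimal sketch (assumes SciPy): t.ppf gives t quantiles; norm.ppf the limiting case.
from scipy.stats import norm, t

print(f"{t.ppf(0.975, 10):.3f}")   # 2.228 = t_0.025(10)
print(f"{t.ppf(0.99, 5):.3f}")     # 3.365 = t_0.01(5)
print(f"{norm.ppf(0.975):.3f}")    # 1.960, the r = infinity row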


Table VII The F Distribution

P(F ≤ f) = \int_0^f \frac{\Gamma[(r_1 + r_2)/2]\, (r_1/r_2)^{r_1/2}\, w^{r_1/2 - 1}}{\Gamma(r_1/2)\, \Gamma(r_2/2)\, (1 + r_1 w/r_2)^{(r_1 + r_2)/2}}\, dw

[Figure: p.d.f. of the F(4, 8) distribution, illustrating the lower-tail probability P(F ≤ f) and the upper critical value Fα(4, 8) with right-tail area α.]


Table VII continued


Numerator Degrees of Freedom, r1 (columns); Denominator Degrees of Freedom, r2 (rows)

α      P(F ≤ f )   r2     1       2       3       4       5       6       7       8       9       10

0.05   0.95    1    161.4   199.5   215.7   224.6   230.2   234.0   236.8   238.9   240.5   241.9
0.025  0.975        647.79  799.50  864.16  899.58  921.85  937.11  948.22  956.66  963.28  968.63
0.01   0.99         4052    4999.5  5403    5625    5764    5859    5928    5981    6022    6056

0.05   0.95    2    18.51   19.00   19.16   19.25   19.30   19.33   19.35   19.37   19.38   19.40
0.025  0.975        38.51   39.00   39.17   39.25   39.30   39.33   39.36   39.37   39.39   39.40
0.01   0.99         98.50   99.00   99.17   99.25   99.30   99.33   99.36   99.37   99.39   99.40

0.05   0.95    3    10.13   9.55    9.28    9.12    9.01    8.94    8.89    8.85    8.81    8.79
0.025  0.975        17.44   16.04   15.44   15.10   14.88   14.73   14.62   14.54   14.47   14.42
0.01   0.99         34.12   30.82   29.46   28.71   28.24   27.91   27.67   27.49   27.35   27.23

0.05   0.95    4    7.71    6.94    6.59    6.39    6.26    6.16    6.09    6.04    6.00    5.96
0.025  0.975        12.22   10.65   9.98    9.60    9.36    9.20    9.07    8.98    8.90    8.84
0.01   0.99         21.20   18.00   16.69   15.98   15.52   15.21   14.98   14.80   14.66   14.55

0.05   0.95    5    6.61    5.79    5.41    5.19    5.05    4.95    4.88    4.82    4.77    4.74
0.025  0.975        10.01   8.43    7.76    7.39    7.15    6.98    6.85    6.76    6.68    6.62
0.01   0.99         16.26   13.27   12.06   11.39   10.97   10.67   10.46   10.29   10.16   10.05

0.05   0.95    6    5.99    5.14    4.76    4.53    4.39    4.28    4.21    4.15    4.10    4.06
0.025  0.975        8.81    7.26    6.60    6.23    5.99    5.82    5.70    5.60    5.52    5.46
0.01   0.99         13.75   10.92   9.78    9.15    8.75    8.47    8.26    8.10    7.98    7.87

0.05   0.95    7    5.59    4.74    4.35    4.12    3.97    3.87    3.79    3.73    3.68    3.64
0.025  0.975        8.07    6.54    5.89    5.52    5.29    5.12    4.99    4.90    4.82    4.76
0.01   0.99         12.25   9.55    8.45    7.85    7.46    7.19    6.99    6.84    6.72    6.62

0.05   0.95    8    5.32    4.46    4.07    3.84    3.69    3.58    3.50    3.44    3.39    3.35
0.025  0.975        7.57    6.06    5.42    5.05    4.82    4.65    4.53    4.43    4.36    4.30
0.01   0.99         11.26   8.65    7.59    7.01    6.63    6.37    6.18    6.03    5.91    5.81

0.05   0.95    9    5.12    4.26    3.86    3.63    3.48    3.37    3.29    3.23    3.18    3.14
0.025  0.975        7.21    5.71    5.08    4.72    4.48    4.32    4.20    4.10    4.03    3.96
0.01   0.99         10.56   8.02    6.99    6.42    6.06    5.80    5.61    5.47    5.35    5.26

0.05   0.95    10   4.96    4.10    3.71    3.48    3.33    3.22    3.14    3.07    3.02    2.98
0.025  0.975        6.94    5.46    4.83    4.47    4.24    4.07    3.95    3.85    3.78    3.72
0.01   0.99         10.04   7.56    6.55    5.99    5.64    5.39    5.20    5.06    4.94    4.85


Table VII continued


Numerator Degrees of Freedom, r1 (columns); Denominator Degrees of Freedom, r2 (rows)

α      P(F ≤ f )   r2     1       2       3       4       5       6       7       8       9       10

0.05   0.95    12   4.75    3.89    3.49    3.26    3.11    3.00    2.91    2.85    2.80    2.75
0.025  0.975        6.55    5.10    4.47    4.12    3.89    3.73    3.61    3.51    3.44    3.37
0.01   0.99         9.33    6.93    5.95    5.41    5.06    4.82    4.64    4.50    4.39    4.30

0.05   0.95    15   4.54    3.68    3.29    3.06    2.90    2.79    2.71    2.64    2.59    2.54
0.025  0.975        6.20    4.77    4.15    3.80    3.58    3.41    3.29    3.20    3.12    3.06
0.01   0.99         8.68    6.36    5.42    4.89    4.56    4.32    4.14    4.00    3.89    3.80

0.05   0.95    20   4.35    3.49    3.10    2.87    2.71    2.60    2.51    2.45    2.39    2.35
0.025  0.975        5.87    4.46    3.86    3.51    3.29    3.13    3.01    2.91    2.84    2.77
0.01   0.99         8.10    5.85    4.94    4.43    4.10    3.87    3.70    3.56    3.46    3.37

0.05   0.95    24   4.26    3.40    3.01    2.78    2.62    2.51    2.42    2.36    2.30    2.25
0.025  0.975        5.72    4.32    3.72    3.38    3.15    2.99    2.87    2.78    2.70    2.64
0.01   0.99         7.82    5.61    4.72    4.22    3.90    3.67    3.50    3.36    3.26    3.17

0.05   0.95    30   4.17    3.32    2.92    2.69    2.53    2.42    2.33    2.27    2.21    2.16
0.025  0.975        5.57    4.18    3.59    3.25    3.03    2.87    2.75    2.65    2.57    2.51
0.01   0.99         7.56    5.39    4.51    4.02    3.70    3.47    3.30    3.17    3.07    2.98

0.05   0.95    40   4.08    3.23    2.84    2.61    2.45    2.34    2.25    2.18    2.12    2.08
0.025  0.975        5.42    4.05    3.46    3.13    2.90    2.74    2.62    2.53    2.45    2.39
0.01   0.99         7.31    5.18    4.31    3.83    3.51    3.29    3.12    2.99    2.89    2.80

0.05   0.95    60   4.00    3.15    2.76    2.53    2.37    2.25    2.17    2.10    2.04    1.99
0.025  0.975        5.29    3.93    3.34    3.01    2.79    2.63    2.51    2.41    2.33    2.27
0.01   0.99         7.08    4.98    4.13    3.65    3.34    3.12    2.95    2.82    2.72    2.63

0.05   0.95    120  3.92    3.07    2.68    2.45    2.29    2.17    2.09    2.02    1.96    1.91
0.025  0.975        5.15    3.80    3.23    2.89    2.67    2.52    2.39    2.30    2.22    2.16
0.01   0.99         6.85    4.79    3.95    3.48    3.17    2.96    2.79    2.66    2.56    2.47

0.05   0.95    ∞    3.84    3.00    2.60    2.37    2.21    2.10    2.01    1.94    1.88    1.83
0.025  0.975        5.02    3.69    3.12    2.79    2.57    2.41    2.29    2.19    2.11    2.05
0.01   0.99         6.63    4.61    3.78    3.32    3.02    2.80    2.64    2.51    2.41    2.32


Table VII continued


Numerator Degrees of Freedom, r1 (columns); Denominator Degrees of Freedom, r2 (rows)

α      P(F ≤ f )   r2     12      15      20      24      30      40      60      120     ∞

0.05   0.95    1    243.9   245.9   248.0   249.1   250.1   251.1   252.2   253.3   254.3
0.025  0.975        976.71  984.87  993.10  997.25  1001.4  1005.6  1009.8  1014.0  1018.3
0.01   0.99         6106    6157    6209    6235    6261    6287    6313    6339    6366

0.05   0.95    2    19.41   19.43   19.45   19.45   19.46   19.47   19.48   19.49   19.50
0.025  0.975        39.42   39.43   39.45   39.46   39.47   39.47   39.48   39.49   39.50
0.01   0.99         99.42   99.43   99.45   99.46   99.47   99.47   99.48   99.49   99.50

0.05   0.95    3    8.74    8.70    8.66    8.64    8.62    8.59    8.57    8.55    8.53
0.025  0.975        14.34   14.25   14.17   14.12   14.08   14.04   13.99   13.95   13.90
0.01   0.99         27.05   26.87   26.69   26.60   26.50   26.41   26.32   26.22   26.13

0.05   0.95    4    5.91    5.86    5.80    5.77    5.75    5.72    5.69    5.66    5.63
0.025  0.975        8.75    8.66    8.56    8.51    8.46    8.41    8.36    8.31    8.26
0.01   0.99         14.37   14.20   14.02   13.93   13.84   13.75   13.65   13.56   13.46

0.05   0.95    5    4.68    4.62    4.56    4.53    4.50    4.46    4.43    4.40    4.36
0.025  0.975        6.52    6.43    6.33    6.28    6.23    6.18    6.12    6.07    6.02
0.01   0.99         9.89    9.72    9.55    9.47    9.38    9.29    9.20    9.11    9.02

0.05   0.95    6    4.00    3.94    3.87    3.84    3.81    3.77    3.74    3.70    3.67
0.025  0.975        5.37    5.27    5.17    5.12    5.07    5.01    4.96    4.90    4.85
0.01   0.99         7.72    7.56    7.40    7.31    7.23    7.14    7.06    6.97    6.88

0.05   0.95    7    3.57    3.51    3.44    3.41    3.38    3.34    3.30    3.27    3.23
0.025  0.975        4.67    4.57    4.47    4.42    4.36    4.31    4.25    4.20    4.14
0.01   0.99         6.47    6.31    6.16    6.07    5.99    5.91    5.82    5.74    5.65

0.05   0.95    8    3.28    3.22    3.15    3.12    3.08    3.04    3.01    2.97    2.93
0.025  0.975        4.20    4.10    4.00    3.95    3.89    3.84    3.78    3.73    3.67
0.01   0.99         5.67    5.52    5.36    5.28    5.20    5.12    5.03    4.95    4.86

0.05   0.95    9    3.07    3.01    2.94    2.90    2.86    2.83    2.79    2.75    2.71
0.025  0.975        3.87    3.77    3.67    3.61    3.56    3.51    3.45    3.39    3.33
0.01   0.99         5.11    4.96    4.81    4.73    4.65    4.57    4.48    4.40    4.31


Table VII continued


Numerator Degrees of Freedom, r1 (columns); Denominator Degrees of Freedom, r2 (rows)

α      P(F ≤ f )   r2     12      15      20      24      30      40      60      120     ∞

0.05   0.95    10   2.91    2.85    2.77    2.74    2.70    2.66    2.62    2.58    2.54
0.025  0.975        3.62    3.52    3.42    3.37    3.31    3.26    3.20    3.14    3.08
0.01   0.99         4.71    4.56    4.41    4.33    4.25    4.17    4.08    4.00    3.91

0.05   0.95    12   2.69    2.62    2.54    2.51    2.47    2.43    2.38    2.34    2.30
0.025  0.975        3.28    3.18    3.07    3.02    2.96    2.91    2.85    2.79    2.72
0.01   0.99         4.16    4.01    3.86    3.78    3.70    3.62    3.54    3.45    3.36

0.05   0.95    15   2.48    2.40    2.33    2.29    2.25    2.20    2.16    2.11    2.07
0.025  0.975        2.96    2.86    2.76    2.70    2.64    2.59    2.52    2.46    2.40
0.01   0.99         3.67    3.52    3.37    3.29    3.21    3.13    3.05    2.96    2.87

0.05   0.95    20   2.28    2.20    2.12    2.08    2.04    1.99    1.95    1.90    1.84
0.025  0.975        2.68    2.57    2.46    2.41    2.35    2.29    2.22    2.16    2.09
0.01   0.99         3.23    3.09    2.94    2.86    2.78    2.69    2.61    2.52    2.42

0.05   0.95    24   2.18    2.11    2.03    1.98    1.94    1.89    1.84    1.79    1.73
0.025  0.975        2.54    2.44    2.33    2.27    2.21    2.15    2.08    2.01    1.94
0.01   0.99         3.03    2.89    2.74    2.66    2.58    2.49    2.40    2.31    2.21

0.05   0.95    30   2.09    2.01    1.93    1.89    1.84    1.79    1.74    1.68    1.62
0.025  0.975        2.41    2.31    2.20    2.14    2.07    2.01    1.94    1.87    1.79
0.01   0.99         2.84    2.70    2.55    2.47    2.39    2.30    2.21    2.11    2.01

0.05   0.95    40   2.00    1.92    1.84    1.79    1.74    1.69    1.64    1.58    1.51
0.025  0.975        2.29    2.18    2.07    2.01    1.94    1.88    1.80    1.72    1.64
0.01   0.99         2.66    2.52    2.37    2.29    2.20    2.11    2.02    1.92    1.80

0.05   0.95    60   1.92    1.84    1.75    1.70    1.65    1.59    1.53    1.47    1.39
0.025  0.975        2.17    2.06    1.94    1.88    1.82    1.74    1.67    1.58    1.48
0.01   0.99         2.50    2.35    2.20    2.12    2.03    1.94    1.84    1.73    1.60

0.05   0.95    120  1.83    1.75    1.66    1.61    1.55    1.50    1.43    1.35    1.25
0.025  0.975        2.05    1.95    1.82    1.76    1.69    1.61    1.53    1.43    1.31
0.01   0.99         2.34    2.19    2.03    1.95    1.86    1.76    1.66    1.53    1.38

0.05   0.95    ∞    1.75    1.67    1.57    1.52    1.46    1.39    1.32    1.22    1.00
0.025  0.975        1.94    1.83    1.71    1.64    1.57    1.48    1.39    1.27    1.00
0.01   0.99         2.18    2.04    1.88    1.79    1.70    1.59    1.47    1.32    1.00
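A minimal sketch (assuming SciPy; not part of the text) reproduces entries of Table VII and illustrates the standard reciprocal relationship F_(1−α)(r1, r2) = 1/F_α(r2, r1), which is how lower-tail F critical values are usually obtained from this table:

# Minimal sketch (assumes SciPy): f.ppf gives F quantiles.
from scipy.stats import f

print(f"{f.ppf(0.95, 4, 8):.2f}")       # 3.84 = F_0.05(4, 8)
print(f"{f.ppf(0.99, 2, 10):.2f}")      # 7.56 = F_0.01(2, 10)
print(f"{1 / f.ppf(0.95, 8, 4):.3f}")   # lower 5% point of F(4, 8), about 0.166
print(f"{f.ppf(0.05, 4, 8):.3f}")       # the same value computed directly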


Table VIII Random Numbers on the Interval (0, 1)

3407  1440  6960  8675  5649  5793  1514
5044  9859  4658  7779  7986  0520  6697
0045  4999  4930  7408  7551  3124  0527
7536  1448  7843  4801  3147  3071  4749
7653  4231  1233  4409  0609  6448  2900

6157  1144  4779  0951  3757  9562  2354
6593  8668  4871  0946  3155  3941  9662
3187  7434  0315  4418  1569  1101  0043
4780  1071  6814  2733  7968  8541  1003
9414  6170  2581  1398  2429  4763  9192

1948  2360  7244  9682  5418  0596  4971
1843  0914  9705  7861  6861  7865  7293
4944  8903  0460  0188  0530  7790  9118
3882  3195  8287  3298  9532  9066  8225
6596  9009  2055  4081  4842  7852  5915

4793  2503  2906  6807  2028  1075  7175
2112  0232  5334  1443  7306  6418  9639
0743  1083  8071  9779  5973  1141  4393
8856  5352  3384  8891  9189  1680  3192
8027  4975  2346  5786  0693  5615  2047

3134  1688  4071  3766  0570  2142  3492
0633  9002  1305  2256  5956  9256  8979
8771  6069  1598  4275  6017  5946  8189
2672  1304  2186  8279  2430  4896  3698
3136  1916  8886  8617  9312  5070  2720

6490  7491  6562  5355  3794  3555  7510
8628  0501  4618  3364  6709  1289  0543
9270  0504  5018  7013  4423  2147  4089
5723  3807  4997  4699  2231  3193  8130
6228  8874  7271  2621  5746  6333  0345

7645  3379  8376  3030  0351  8290  3640
6842  5836  6203  6171  2698  4086  5469
6126  7792  9337  7773  7286  4236  1788
4956  0215  3468  8038  6144  9753  3131
1327  4736  6229  8965  7215  6458  3937

9188  1516  5279  5433  2254  5768  8718
0271  9627  9442  9217  4656  7603  8826
2127  1847  1331  5122  8332  8195  3322
2102  9201  2911  7318  7670  6079  2676
1706  6011  5280  5552  5180  4630  4747

7501  7635  2301  0889  6955  8113  4364
5705  1900  7144  8707  9065  8163  9846
3234  2599  3295  9160  8441  0085  9317
5641  4935  7971  8917  1978  5649  5799
2127  1868  3664  9376  1984  6315  8396


Table IX Distribution Function of the Correlation Coefficient R, ρ = 0

[Figure: p.d.f. of R for ν = 15 d.f., illustrating the lower-tail probability P(R ≤ r) and the upper critical value rα(ν) with right-tail area α.]

P(R ≤ r) = \int_{-1}^{r} \frac{\Gamma[(n-1)/2]}{\Gamma(1/2)\, \Gamma[(n-2)/2]}\, (1 - w^2)^{(n-4)/2}\, dw

P(R ≤ r)                          0.95        0.975        0.99         0.995
ν = n − 2 degrees of freedom      r_0.05(ν)   r_0.025(ν)   r_0.01(ν)    r_0.005(ν)

  1   0.9877   0.9969   0.9995   0.9999
  2   0.9000   0.9500   0.9800   0.9900
  3   0.8053   0.8783   0.9343   0.9587
  4   0.7292   0.8113   0.8822   0.9172
  5   0.6694   0.7544   0.8329   0.8745

  6   0.6215   0.7067   0.7887   0.8343
  7   0.5822   0.6664   0.7497   0.7977
  8   0.5493   0.6319   0.7154   0.7646
  9   0.5214   0.6020   0.6850   0.7348
 10   0.4972   0.5759   0.6581   0.7079

 11   0.4761   0.5529   0.6338   0.6835
 12   0.4575   0.5323   0.6120   0.6613
 13   0.4408   0.5139   0.5922   0.6411
 14   0.4258   0.4973   0.5742   0.6226
 15   0.4123   0.4821   0.5577   0.6054

 16   0.4000   0.4683   0.5425   0.5897
 17   0.3887   0.4555   0.5285   0.5750
 18   0.3783   0.4437   0.5154   0.5614
 19   0.3687   0.4328   0.5033   0.5487
 20   0.3597   0.4226   0.4920   0.5367

 25   0.3232   0.3808   0.4450   0.4869
 30   0.2959   0.3494   0.4092   0.4487
 35   0.2746   0.3246   0.3809   0.4182
 40   0.2572   0.3044   0.3578   0.3931
 45   0.2428   0.2875   0.3383   0.3721

 50   0.2306   0.2732   0.3218   0.3541
 60   0.2108   0.2500   0.2948   0.3248
 70   0.1954   0.2318   0.2736   0.3017
 80   0.1829   0.2172   0.2565   0.2829
 90   0.1725   0.2049   0.2422   0.2673

100   0.1638   0.1946   0.2300   0.2540
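These critical values can also be recovered from the t distribution: when ρ = 0, T = R√ν/√(1 − R²) has a t distribution with ν = n − 2 degrees of freedom, so r_α(ν) = t_α(ν)/√(ν + t_α(ν)²). A minimal sketch (SciPy assumed; not part of the text) checks the ν = 15 row:

# Minimal sketch (assumes SciPy): recover r_alpha(nu) from the t distribution.
from math import sqrt
from scipy.stats import t

nu = 15
t05 = t.ppf(0.95, nu)                      # t_0.05(15) = 1.753
print(f"{t05 / sqrt(nu + t05**2):.3f}")    # 0.412, agreeing with r_0.05(15) = 0.4123 above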


Table X Discrete Distributions

Each entry lists the distribution and its parameter values, the probability mass function f(x), the moment-generating function M(t), the mean E(X), the variance Var(X), and typical examples. Throughout, q = 1 − p.

Bernoulli, 0 < p < 1:
f(x) = p^x q^(1−x), x = 0, 1
M(t) = q + pe^t, −∞ < t < ∞
E(X) = p, Var(X) = pq
Example: an experiment with two possible outcomes, say success and failure, with p = P(success).

Binomial b(n, p), n = 1, 2, 3, . . . , 0 < p < 1:
f(x) = (n choose x) p^x q^(n−x), x = 0, 1, . . . , n
M(t) = (q + pe^t)^n, −∞ < t < ∞
E(X) = np, Var(X) = npq
Example: the number of successes in a sequence of n Bernoulli trials with p = P(success).

Geometric, 0 < p < 1:
f(x) = q^(x−1) p, x = 1, 2, . . .
M(t) = pe^t/(1 − qe^t), t < −ln(1 − p)
E(X) = 1/p, Var(X) = q/p²
Example: the number of trials needed to obtain the first success in a sequence of Bernoulli trials.

Hypergeometric, N1 > 0, N2 > 0, N = N1 + N2:
f(x) = (N1 choose x)(N2 choose n − x)/(N choose n), x ≤ n, x ≤ N1, n − x ≤ N2
E(X) = n(N1/N), Var(X) = n(N1/N)(N2/N)(N − n)/(N − 1)
Example: selecting n objects at random without replacement from a set composed of two types of objects.

Negative Binomial, r = 1, 2, 3, . . . , 0 < p < 1:
f(x) = (x − 1 choose r − 1) p^r q^(x−r), x = r, r + 1, . . .
M(t) = (pe^t)^r/(1 − qe^t)^r, t < −ln(1 − p)
E(X) = r/p, Var(X) = rq/p²
Example: the number of trials needed to obtain the rth success in a sequence of Bernoulli trials.

Poisson, λ > 0:
f(x) = λ^x e^(−λ)/x!, x = 0, 1, 2, . . .
M(t) = e^(λ(e^t − 1)), −∞ < t < ∞
E(X) = λ, Var(X) = λ
Example: the number of events occurring in a unit interval when events occur randomly at a mean rate of λ per unit interval.

Uniform, m > 0:
f(x) = 1/m, x = 1, 2, . . . , m
E(X) = (m + 1)/2, Var(X) = (m² − 1)/12
Example: select an integer at random from 1, 2, . . . , m.


Table XI Continuous Distributions

Each entry lists the distribution and its parameter values, the probability density function f(x), the moment-generating function M(t), the mean E(X), the variance Var(X), and typical examples.

Beta, α > 0, β > 0:
f(x) = [Γ(α + β)/(Γ(α)Γ(β))] x^(α−1)(1 − x)^(β−1), 0 < x < 1
E(X) = α/(α + β), Var(X) = αβ/[(α + β + 1)(α + β)²]
Example: X = X1/(X1 + X2), where X1 and X2 have independent gamma distributions with the same θ.

Chi-square χ²(r), r = 1, 2, . . . :
f(x) = x^(r/2−1) e^(−x/2)/[Γ(r/2) 2^(r/2)], 0 < x < ∞
M(t) = 1/(1 − 2t)^(r/2), t < 1/2
E(X) = r, Var(X) = 2r
Example: a gamma distribution with θ = 2 and α = r/2; the sum of squares of r independent N(0, 1) random variables.

Exponential, θ > 0:
f(x) = (1/θ) e^(−x/θ), 0 ≤ x < ∞
M(t) = 1/(1 − θt), t < 1/θ
E(X) = θ, Var(X) = θ²
Example: the waiting time to the first arrival when observing a Poisson process with a mean rate of arrivals equal to λ = 1/θ.

Gamma, α > 0, θ > 0:
f(x) = x^(α−1) e^(−x/θ)/[Γ(α) θ^α], 0 < x < ∞
M(t) = 1/(1 − θt)^α, t < 1/θ
E(X) = αθ, Var(X) = αθ²
Example: the waiting time to the αth arrival when observing a Poisson process with a mean rate of arrivals equal to λ = 1/θ.

Normal N(µ, σ²), −∞ < µ < ∞, σ > 0:
f(x) = [1/(σ√(2π))] e^(−(x−µ)²/2σ²), −∞ < x < ∞
M(t) = e^(µt + σ²t²/2), −∞ < t < ∞
E(X) = µ, Var(X) = σ²
Example: errors in measurements; heights of children; breaking strengths.

Uniform U(a, b), −∞ < a < b < ∞:
f(x) = 1/(b − a), a ≤ x ≤ b
M(t) = (e^(tb) − e^(ta))/[t(b − a)], t ≠ 0; M(0) = 1
E(X) = (a + b)/2, Var(X) = (b − a)²/12
Example: select a point at random from the interval [a, b].
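The chi-square entry's example can be checked numerically: the sum of squares of r independent N(0, 1) variables should have mean r and variance 2r. A small simulation sketch (NumPy assumed; the seed and sample size are arbitrary choices, not from the text):

# Minimal simulation sketch (assumes NumPy): sum of r squared N(0, 1) variables.
import numpy as np

rng = np.random.default_rng(0)
r = 5
w = (rng.standard_normal((100_000, r)) ** 2).sum(axis=1)
print(round(w.mean(), 2), round(w.var(), 2))   # close to r = 5 and 2r = 10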


Table XII Tests and Confidence Intervals

Each entry lists the distribution and assumptions, θ (the parameter of interest), W (the variable used to test H0: θ = θ0), a two-sided 1 − α confidence interval for θ, and comments.

N(µ, σ²) or n large, σ² known; θ = µ:
W = (X̄ − θ0)/(σ/√n)
Confidence interval: x̄ ± z_{α/2} σ/√n
Comments: W is N(0, 1); P(W ≥ z_{α/2}) = α/2.

N(µ, σ²), σ² unknown; θ = µ:
W = (X̄ − θ0)/(S/√n)
Confidence interval: x̄ ± t_{α/2}(n−1) s/√n
Comments: W has a t distribution with n − 1 degrees of freedom; P[W ≥ t_{α/2}(n−1)] = α/2.

Any distribution with known variance σ²; θ = µ:
W = (X̄ − θ0)/(σ/√n)
Confidence interval: x̄ ± z_{α/2} σ/√n
Comments: W has an approximate N(0, 1) distribution for n sufficiently large.

N(µ_X, σ_X²) and N(µ_Y, σ_Y²), σ_X² and σ_Y² known; θ = µ_X − µ_Y:
W = (X̄ − Ȳ − θ0)/√(σ_X²/n + σ_Y²/m)
Confidence interval: x̄ − ȳ ± z_{α/2} √(σ_X²/n + σ_Y²/m)
Comments: W is N(0, 1).

N(µ_X, σ_X²) and N(µ_Y, σ_Y²), σ_X² and σ_Y² unknown; θ = µ_X − µ_Y:
W = (X̄ − Ȳ − θ0)/√(S_X²/n + S_Y²/m)
Confidence interval: x̄ − ȳ ± z_{α/2} √(s_x²/n + s_y²/m)
Comments: W is approximately N(0, 1) if the sample sizes are large.

N(µ_X, σ_X²) and N(µ_Y, σ_Y²), σ_X² = σ_Y² unknown; θ = µ_X − µ_Y:
W = (X̄ − Ȳ − θ0)/√{[(n − 1)S_X² + (m − 1)S_Y²]/(n + m − 2) · (1/n + 1/m)}
Confidence interval: x̄ − ȳ ± t_{α/2}(n+m−2) s_p √(1/n + 1/m), where s_p = √{[(n − 1)s_x² + (m − 1)s_y²]/(n + m − 2)}
Comments: W has a t distribution with r = n + m − 2 degrees of freedom.

D = X − Y is N(µ_X − µ_Y, σ_D²), X and Y dependent; θ = µ_X − µ_Y:
W = (D̄ − θ0)/(S_D/√n)
Confidence interval: d̄ ± t_{α/2}(n−1) s_d/√n
Comments: W has a t distribution with n − 1 degrees of freedom.


Table XII continued

N(µ, σ²), µ unknown; θ = σ²:
W = (n − 1)S²/θ0
Confidence interval: [(n − 1)s²/χ²_{α/2}(n−1), (n − 1)s²/χ²_{1−α/2}(n−1)]
Comments: W is χ²(n−1); P[W ≤ χ²_{1−α/2}(n−1)] = α/2 and P[W ≥ χ²_{α/2}(n−1)] = α/2.

N(µ, σ²), µ unknown; θ = σ:
W = (n − 1)S²/θ0²
Confidence interval: [√((n − 1)s²/χ²_{α/2}(n−1)), √((n − 1)s²/χ²_{1−α/2}(n−1))]
Comments: W is χ²(n−1); P[W ≤ χ²_{1−α/2}(n−1)] = α/2 and P[W ≥ χ²_{α/2}(n−1)] = α/2.

N(µ_X, σ_X²) and N(µ_Y, σ_Y²), µ_X and µ_Y unknown; θ = σ_X²/σ_Y²:
W = θ0 S_Y²/S_X²
Confidence interval: [(s_x²/s_y²)/F_{α/2}(n−1, m−1), (s_x²/s_y²) F_{α/2}(m−1, n−1)]
Comments: W has an F distribution with m − 1 and n − 1 degrees of freedom.

b(n, p); θ = p:
W = (Y/n − θ0)/√[(Y/n)(1 − Y/n)/n]
Confidence interval: y/n ± z_{α/2} √[(y/n)(1 − y/n)/n]
Comments: W is approximately N(0, 1) for n sufficiently large.

b(n, p); θ = p:
Confidence interval: p̃ ± z_{α/2} √[p̃(1 − p̃)/(n + 4)], where p̃ = (y + 2)/(n + 4)
Comments: approximately N(0, 1) for n sufficiently large.

b(n1, p1) and b(n2, p2); θ = p1 − p2:
W = (Y1/n1 − Y2/n2 − θ0)/√{[(Y1 + Y2)/(n1 + n2)][1 − (Y1 + Y2)/(n1 + n2)](1/n1 + 1/n2)}
Confidence interval: y1/n1 − y2/n2 ± z_{α/2} √[(y1/n1)(1 − y1/n1)/n1 + (y2/n2)(1 − y2/n2)/n2]
Comments: W is approximately N(0, 1) when n1 and n2 are sufficiently large.
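As a worked illustration of the second entry of this table (normal mean with σ² unknown), the sketch below computes x̄ ± t_{α/2}(n−1) s/√n for a small hypothetical sample; the data values are invented for illustration only (NumPy and SciPy are assumed):

# Minimal sketch (assumes NumPy and SciPy); hypothetical data, alpha = 0.05.
import numpy as np
from scipy.stats import t

x = np.array([5.1, 4.9, 5.6, 5.2, 4.8, 5.4, 5.0, 5.3])   # hypothetical sample
n, alpha = len(x), 0.05
xbar, s = x.mean(), x.std(ddof=1)                          # sample mean and standard deviation
half = t.ppf(1 - alpha / 2, n - 1) * s / np.sqrt(n)        # t_{alpha/2}(n-1) * s / sqrt(n)
print(f"({xbar - half:.3f}, {xbar + half:.3f})")           # two-sided 95% confidence interval for mu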