
Transcript
Page 1: Statistics (recap)

Statistics (Recap)

Finance & Management Students

Farzad Javidanrad

October 2013

University of Nottingham-Business School

Page 2: Statistics (recap)

Probability

• Some Preliminary Concepts:

Random: Something that happens (occurs) by chance.

Population: The set of all possible outcomes of a random experiment, or a collection of all members of a specific group under study. This collection forms a space from which all possible samples can be drawn; for that reason it is sometimes called the sample space.

Sample: Any subset of the population (sample space).

In tossing a die:

Random event: the event that any particular face of the die appears.

Population (sample space): the set {1, 2, 3, 4, 5, 6}.

Sample: any subset of the set above, such as {3} or {2, 4, 6}.

Page 3: Statistics (recap)

Probability

• Two events are mutually exclusive if they cannot happen together.

The occurrence of one of them prevents the occurrence of the other. For example, if a baby is a boy it cannot be a girl, and vice versa.

• Two events are independent if the occurrence of one has no effect on the chance of occurrence of the other. For example, the result of rolling a die has no impact on the outcome of flipping a coin. But in the experiment of drawing two cards consecutively from a deck of 52 (each card equally likely to be chosen), the chance of getting the second card is affected by the result of the first card.

• Two events are exhaustive if together they include all possible outcomes. For example, in rolling a die, having an odd number and having an even number are exhaustive events.

Page 4: Statistics (recap)

Probability

• If event A can happen in m different ways out of n equally likely ways, the probability of event A can be shown as its relative frequency; i.e.:

P(A) = m/n

where m is the number of ways that event A occurs and n is the total number of equally likely possible outcomes.

U: sample space (population)

A: an event (sample)

A′: mutually exclusive event with A

A & A′ are collectively exhaustive

(The slide's Venn diagram shows event A and its complement A′ inside the sample space U.)

Page 5: Statistics (recap)

Probability

• As 0 ≤ m ≤ n, it can be concluded that

0 ≤ m/n ≤ 1, or 0 ≤ P(A) ≤ 1

• P(A) = 0 means that event A cannot happen, and P(A) = 1 means that the event will happen with certainty.

• Defining A′ as the event of "non-occurrence" of event A, we find that:

P(A′) = (n − m)/n = 1 − m/n = 1 − P(A)

or P(A) + P(A′) = 1

Page 6: Statistics (recap)

Probability of Multiple Events

• If A and B are not mutually exclusive events, the probability that at least one of them happens (A or B) can be calculated as follows:

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

Here A ∪ B reads "A or B" and A ∩ B reads "A and B"; subtracting P(A ∩ B) removes the double-counted overlap.

Page 7: Statistics (recap)

Probability of Multiple Events

In case we are dealing with three events:

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C)

(The slide illustrates this with a three-circle Venn diagram of P(A), P(B), P(C) and the central overlap P(A ∩ B ∩ C).)

Page 8: Statistics (recap)

Probability of Multiple Events

• Considering P(A ∪ B) = P(A) + P(B) − P(A ∩ B), we can have the following situations:

1. If A and B are mutually exclusive events, then:

P(A ∩ B) = 0

2. If A and B are two independent events, then:

P(A ∩ B) = P(A) × P(B)

3. If A and B are dependent events, then:

P(A ∩ B) = P(A) × P(B|A) = P(B) × P(A|B)

where P(A|B) and P(B|A) are conditional probabilities; P(A|B) means the probability of event A given that event B has already happened.

Page 9: Statistics (recap)

Probability of Multiple Events

o The probability of picking at random a Heart or a Queen in a single draw from a deck of 52 cards is:

P(H ∪ Q) = P(H) + P(Q) − P(H ∩ Q) = 13/52 + 4/52 − 1/52 = 4/13

o The probability of getting a 1 or a 4 on a single toss of a fair die is:

P(1 ∪ 4) = P(1) + P(4) = 1/6 + 1/6 = 1/3

As they cannot happen together, they are mutually exclusive events and P(1 ∩ 4) = 0.

o The probability of getting two heads in the experiment of tossing two fair coins (two independent events) is:

P(H ∩ H) = 1/2 × 1/2 = 1/4
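The Heart-or-Queen example can be checked by brute-force enumeration of the deck; this sketch (not part of the slides, with illustrative rank/suit names) confirms that the inclusion-exclusion formula agrees with direct counting:

```python
# A sketch verifying P(H ∪ Q) = P(H) + P(Q) − P(H ∩ Q) = 4/13 by
# enumerating all 52 equally likely cards.
from fractions import Fraction
from itertools import product

suits = ["hearts", "diamonds", "clubs", "spades"]
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = list(product(ranks, suits))            # 52 cards as (rank, suit)

H = {c for c in deck if c[1] == "hearts"}     # the 13 hearts
Q = {c for c in deck if c[0] == "Q"}          # the 4 queens

p_union = Fraction(len(H | Q), len(deck))     # direct count of H or Q
p_formula = (Fraction(len(H), 52) + Fraction(len(Q), 52)
             - Fraction(len(H & Q), 52))      # inclusion-exclusion

print(p_union, p_formula)                     # 4/13 4/13
```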

Page 10: Statistics (recap)

Probability of Multiple Events

o The probability of picking two aces, without returning the first card to the deck of 52 playing cards, involves a conditional probability:

P(1st ace ∩ 2nd ace) = P(1st ace) × P(2nd ace | 1st ace)

or, written more compactly:

P(A1 ∩ A2) = P(A1) × P(A2|A1) = 4/52 × 3/51 = 1/221

• If two events A and B are independent of each other, then:

P(A|B) = P(A) and P(B|A) = P(B)

Page 11: Statistics (recap)

Random Variable & Probability Distribution

Some Basic Concepts:

• Variable: A letter (symbol) which represents the elements of a specific set.

• Random Variable: A variable whose values appear randomly according to a probability distribution.

• Probability Distribution: A rule (function) which assigns a probability to each value of a random variable.

• Variables (including random variables) are divided into two general categories:

1) Discrete Variables, and

2) Continuous Variables

Page 12: Statistics (recap)

Random Variable & Probability Distribution

• A discrete variable is a variable whose elements (values) can be put in correspondence with the set of natural numbers or any subset of it. So it is possible to order and count its values; the number of values can be finite or infinite.

• For a discrete variable it is not possible to define a neighbourhood, however small, around any value in its domain. There is a jump from one value to the next.

• If the elements of the domain of a variable can be put in correspondence with the set of real numbers or any subset of it, the variable is called continuous. It is not possible to order and count the values of a continuous variable. A variable is continuous if, for any value in its domain, a neighbourhood, however small, can be defined.

Page 13: Statistics (recap)

Random Variable & Probability Distribution

• Probability Distribution: A rule (function) that associates a probability either with each possible value of a random variable (RV) individually, or with a set of them in an interval.*

• For a discrete RV this rule associates a probability with each possible individual outcome. For example, the probability distribution for the number of Heads when flipping a fair coin (note: Σ Pᵢ = 1):

In one trial (outcomes H, T):

x      0     1
P(x)   0.5   0.5

In two trials (outcomes HH, HT, TH, TT):

x      0      1     2
P(x)   0.25   0.5   0.25

o The probability distribution for the price change of a share in the stock market in one day:

x = Price change   (+1)   (0)   (−1)
P(x)               0.6    0.1   0.3
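The two-trial table above can be rebuilt mechanically by listing the four equally likely outcomes and counting Heads; a short sketch (not part of the slides):

```python
# A sketch rebuilding the two-trial table: x = number of Heads in two
# fair-coin flips, from the equally likely outcomes HH, HT, TH, TT.
from itertools import product
from collections import Counter

outcomes = list(product("HT", repeat=2))             # HH, HT, TH, TT
counts = Counter(o.count("H") for o in outcomes)     # heads per outcome

dist = {x: counts[x] / len(outcomes) for x in sorted(counts)}
print(dist)                                          # {0: 0.25, 1: 0.5, 2: 0.25}
assert abs(sum(dist.values()) - 1) < 1e-12           # probabilities sum to 1
```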

Page 14: Statistics (recap)

Probability Distributions (Continuous)

• The probability that a continuous random variable takes exactly one of the values in its domain is zero, because the number of all possible outcomes n is infinite and m/∞ → 0.

• For the above reason, the probability for a continuous random variable needs to be calculated over an interval.

• The probability distribution of a continuous random variable is often called a probability density function (PDF), or simply probability function. It is usually denoted f(x) and has the following properties:

I. f(x) ≥ 0 (similar to P(x) ≥ 0 for a discrete RV*)

II. ∫_{−∞}^{+∞} f(x) dx = 1 (similar to Σ P(x) = 1 for a discrete RV)

III. ∫_{a}^{b} f(x) dx = P(a ≤ x ≤ b) = F(b) − F(a) (the probability given to the set of values in the interval [a, b])**

Page 15: Statistics (recap)

Probability Distributions (Continuous)

• where F(x) is the integral of the PDF f(x); it is called the Cumulative Distribution Function (CDF) and for any real value of x is defined as:

F(x) ≡ P(X ≤ x)

The CDF gives the area under the PDF f(x) from −∞ to x. For a discrete random variable, the CDF is the sum of the probabilities of all values up to and including x.

(The slide's figure shows the PDF f(x) with the shaded area F(x) ≡ P(X ≤ x).)

Adopted from http://beyondbitsandatomsblog.stanford.edu/spring2010/tag/embodied-artifacts/

Page 16: Statistics (recap)

Some Characteristics of Probability Distributions

• Expected Value (Probabilistic Mean Value): One of the most important measures, showing the central tendency of the distribution. It is the weighted average of all possible values of the random variable x, and it is denoted E(x).

• For a discrete RV (with n possible outcomes):

E(x) = x₁P(x₁) + x₂P(x₂) + ⋯ + xₙP(xₙ) = Σ_{i=1}^{n} xᵢP(xᵢ)

• For a continuous RV:

E(x) = ∫_{−∞}^{+∞} x·f(x) dx

Page 17: Statistics (recap)

Some Characteristics of Probability Distributions

• Properties of E(x):

i. If c is a constant then E(c) = c.

ii. If a and b are constants then E(ax + b) = aE(x) + b.

iii. If a₁, …, aₙ are constants then

E(a₁x₁ + ⋯ + aₙxₙ) = a₁E(x₁) + ⋯ + aₙE(xₙ)

or

E(Σ_{i=1}^{n} aᵢxᵢ) = Σ_{i=1}^{n} aᵢE(xᵢ)

iv. If x and y are independent random variables then

E(xy) = E(x)·E(y)

Page 18: Statistics (recap)

Some Characteristics of Probability Distributions

v. If ๐’ˆ ๐’™ is a function of random variable ๐’™ then

๐‘ฌ ๐’ˆ ๐’™ = ๐’ˆ ๐’™ .๐‘ท(๐’™)

๐‘ฌ ๐’ˆ ๐’™ = ๐’ˆ ๐’™ . ๐’‡ ๐’™ ๐’…๐’™

โ€ข Variance: To measure how random variable ๐’™ is dispersed around its expected value, variance can help. If we show ๐‘ฌ ๐’™ = ๐ , then

๐’—๐’‚๐’“ ๐’™ = ๐ˆ๐Ÿ = ๐‘ฌ[ ๐’™ โˆ’ ๐‘ฌ ๐’™๐Ÿ]

= ๐‘ฌ[ ๐’™ โˆ’ ๐ ๐Ÿ]

= ๐‘ฌ[๐’™๐Ÿ โˆ’ ๐Ÿ๐’™๐ + ๐๐Ÿ]

= ๐‘ฌ ๐’™๐Ÿ โˆ’ ๐Ÿ๐๐‘ฌ ๐’™ + ๐๐Ÿ

= ๐‘ฌ ๐’™๐Ÿ โˆ’ ๐๐Ÿ

For discreet RV

For continuous RV

Page 19: Statistics (recap)

Some Characteristics of Probability Distributions

var(x) = Σ_{i=1}^{n} (xᵢ − μ)²·P(xᵢ)          (for a discrete RV)

var(x) = ∫_{−∞}^{+∞} (x − μ)²·f(x) dx         (for a continuous RV)

• Properties of Variance:

i. If c is a constant then var(c) = 0.

ii. If a and b are constants then var(ax + b) = a²·var(x).

iii. If x and y are independent random variables then

var(x ± y) = var(x) + var(y)

(this can be extended to more variables)
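The identity var(x) = E(x²) − μ² derived above can be checked numerically on the share-price distribution from the earlier slide (x = +1, 0, −1 with P = 0.6, 0.1, 0.3); a short sketch, not part of the slides:

```python
# A sketch checking var(x) = E[(x − μ)²] = E(x²) − μ² on the discrete
# share-price distribution from the slides.
xs = [1, 0, -1]
ps = [0.6, 0.1, 0.3]

mu = sum(x * p for x, p in zip(xs, ps))                     # E(x) = 0.3
var_def = sum((x - mu) ** 2 * p for x, p in zip(xs, ps))    # E[(x − μ)²]
var_alt = sum(x * x * p for x, p in zip(xs, ps)) - mu ** 2  # E(x²) − μ²

print(mu, var_def, var_alt)
```

Both routes give the same variance (0.81 here), as the algebra on the slide promises.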

Page 20: Statistics (recap)

Probability Distributions (Discrete RV)

• Some of the well-known probability distributions are:

• The Binomial Distribution:

1. The probability of the occurrence of an event is p and does not change.

2. The experiment is repeated n times.

3. The probability that out of n trials the event appears x times is:

P(x) = [n! / (x!(n − x)!)] pˣ(1 − p)ⁿ⁻ˣ

The mean value and standard deviation of the binomial distribution are:

μ = Σ_{i=0}^{n} xᵢ·P(xᵢ) = np and σ = √[Σ_{i=0}^{n} (xᵢ − μ)²·P(xᵢ)] = √[np(1 − p)]

So, to show that the probability distribution of the random variable X is binomial we can write X ~ Bi(np, np(1 − p)), listing its mean and variance.

Page 21: Statistics (recap)

Probability Distributions (Discrete RV)

• A gambler thinks his chance of getting a 1 when rolling a die is high. What is his chance of getting four 1s out of six rolls of a fair die?

The probability of getting a 1 in an individual trial is 1/6, and it remains the same in all 6 trials. So,

P(x = 4) = [6! / (4!·2!)] (1/6)⁴ (5/6)² = 375/46656 ≈ 0.008, i.e. less than 1%
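The binomial formula can be wrapped in a few lines of Python and applied to the gambler example; exact fractions make the arithmetic transparent (a sketch, not part of the slides):

```python
# A sketch of the binomial pmf P(x) = C(n, x) p^x (1 − p)^(n−x),
# applied to "four 1s in six rolls of a fair die".
from math import comb
from fractions import Fraction

def binom_pmf(x, n, p):
    """Probability of exactly x successes in n trials with success prob p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

p = binom_pmf(4, 6, Fraction(1, 6))
print(p, float(p))               # 125/15552 ≈ 0.008

# The mean of the distribution is np = 6 × 1/6 = 1:
mean = sum(x * binom_pmf(x, 6, Fraction(1, 6)) for x in range(7))
```

Note 375/46656 reduces to 125/15552; the check on `mean` confirms the μ = np formula from the previous slide.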

• The Poisson Distribution:

1. It is used to calculate the probability of a given number of desired events (number of successes) in a specific period of time.

2. The average number of desired events (number of successes) per unit of time remains constant.

Page 22: Statistics (recap)

Probability Distributions (Discrete RV)

• So, the probability of having x successes is calculated by:

P(x) = λˣ e^(−λ) / x!

where λ is the average number of successes in the specific period of time and e ≈ 2.7183.

• The mean value and standard deviation of the Poisson distribution are:

μ = Σ_{i=0}^{n} xᵢ·P(xᵢ) = λ and σ = √[Σ_{i=0}^{n} (xᵢ − μ)²·P(xᵢ)] = √λ

So, to show that the probability distribution of the random variable X is Poisson we can write X ~ Poi(λ, λ), listing its mean and variance.

o The emergency section in a hospital receives 2 calls per half hour (4 calls per hour). The probability of getting exactly 2 calls in a randomly chosen hour on a random day is:

P(x = 2) = 4² e⁻⁴ / 2! = 0.1465 ≈ 15%
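The emergency-calls example can be reproduced directly from the Poisson formula; the sketch below (not part of the slides) also sums the pmf to confirm that the mean is λ:

```python
# A sketch of the Poisson pmf P(x) = λ^x e^(−λ) / x!, applied to the
# hospital example (λ = 4 calls per hour, exactly 2 calls).
from math import exp, factorial

def poisson_pmf(x, lam):
    """Probability of exactly x successes when the average rate is lam."""
    return lam ** x * exp(-lam) / factorial(x)

p = poisson_pmf(2, 4)
print(round(p, 4))          # 0.1465

# The mean of the distribution is λ (summing far enough out for λ = 4):
mean = sum(x * poisson_pmf(x, 4) for x in range(100))
```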

Page 23: Statistics (recap)

The Normal Distribution (Continuous RV)

• The Normal Distribution: It is the best-known probability distribution, reflecting the nature of many random variables in the world. The probability density function (PDF) of the normal distribution is:

1. Symmetrical around its mean value (μ).

2. Bell-shaped, with two tails approaching the horizontal axis asymptotically as we move further away from the mean.

Adopted from http://www.pdnotebook.com/2010/06/statistical-tolerance-analysis-root-sum-square/

Page 24: Statistics (recap)

The Normal Distribution (Continuous RV)

3. The probability density function (PDF) of the normal distribution can be represented by:

f(x) = [1 / (σ√(2π))] e^(−(x − μ)² / (2σ²))     (−∞ < x < +∞)

where μ and σ are the mean and standard deviation respectively:

μ = ∫_{−∞}^{+∞} x·f(x) dx and σ = √[∫_{−∞}^{+∞} (x − μ)²·f(x) dx]

So, X ~ N(μ, σ²).

• A linear combination of independent normally distributed random variables is itself normally distributed; that is,

if X ~ N(μ₁, σ₁²) and Y ~ N(μ₂, σ₂²) and if Z = aX + bY then

Z ~ N(aμ₁ + bμ₂, a²σ₁² + b²σ₂²)

• This can be extended to more than two random variables.
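The linear-combination property can be checked by simulation; this sketch (not part of the slides, with illustrative parameter values) draws independent X ~ N(2, 9) and Y ~ N(1, 4) and verifies that Z = 3X + 2Y has mean 3·2 + 2·1 = 8 and variance 9·9 + 4·4 = 97:

```python
# A simulation sketch of Z = aX + bY ~ N(aμ1 + bμ2, a²σ1² + b²σ2²)
# for independent X ~ N(2, 9) and Y ~ N(1, 4), with a = 3, b = 2.
import random
import statistics

random.seed(42)
n = 100_000
# random.gauss takes (mean, standard deviation), so sd = 3 and sd = 2
zs = [3 * random.gauss(2, 3) + 2 * random.gauss(1, 2) for _ in range(n)]

print(round(statistics.mean(zs), 2), round(statistics.variance(zs), 1))
```

The sample mean lands near 8 and the sample variance near 97, matching the formula on the slide.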

Page 25: Statistics (recap)

The Normal Distribution (Continuous RV)

• Recalling the last property of the PDF (∫_{a}^{b} f(x) dx = P(a ≤ x ≤ b)), it is difficult to calculate probabilities from the normal PDF above for every different pair of values of μ and σ. The solution to this problem is to transform the normal variable x into the standardised normal variable (or simply, standard normal variable) z, by:

z = (x − μ)/σ

Its parameters are the same for every normal random variable, whatever the values of μ and σ², because we always have E(z) = 0 and var(z) = 1 (why?).

• The probability distribution for the standard normal variable is defined as:

f(z) = [1/√(2π)] e^(−z²/2), Z ~ N(0, 1)

X ~ N(μ, σ²) → (standardised) → Z ~ N(0, 1)

Adopted and amended from http://www.mathsisfun.com/data/standard-normal-distribution.html

Page 26: Statistics (recap)

The Standard Normal Distribution

• Properties of the standard normal distribution curve:

1. It is symmetrical around the y-axis.

2. The area under the curve can be split into two equal areas, that is:

∫_{−∞}^{0} f(z) dz = ∫_{0}^{+∞} f(z) dz = 0.5

• To find the area under the curve to the left of z₁ = 1.26, using the z-table (next slide), we have:

P(z ≤ z₁ = 1.26) = ∫_{−∞}^{0} f(z) dz + ∫_{0}^{z₁} f(z) dz = 0.5 + 0.3962 = 0.8962 ≈ 90%

(The slide's figure shows the standard normal curve f(z), split into two 50% halves, with the area up to z₁ = 1.26 shaded.)
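The table lookup above can be reproduced without a table: the standard normal CDF can be written with `math.erf`, as in this sketch (not part of the slides):

```python
# A sketch reproducing the z-table lookup P(z ≤ 1.26) ≈ 0.8962 using the
# standard normal CDF, Φ(z) = ½(1 + erf(z/√2)).
from math import erf, sqrt

def phi(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

print(round(phi(1.26), 4))        # 0.8962
print(round(phi(1.26) - 0.5, 4))  # 0.3962, the tabulated area from 0 to z
```

Subtracting 0.5 recovers exactly the "area from 0 to z" convention used by the z-table on the next slide.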

Page 27: Statistics (recap)

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09

0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359

0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753

0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141

0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517

0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879

0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224

0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549

0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852

0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133

0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389

1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621

1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830

1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015

1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177

1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319

1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441

1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545

1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633

1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706

1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767

2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817

2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857

2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890

2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916

2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936

2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952

2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964

2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974

2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981

2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986

3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990

Page 28: Statistics (recap)

Working with the Z-Table

• To find the probability:

P(0.89 < z < 1.5) = ∫_{0}^{z₂} f(z) dz − ∫_{0}^{z₁} f(z) dz
= F(1.5) − F(0.89) = 0.4332 − 0.3133 = 0.1199 ≈ 12%

as both values are positive (here F denotes the tabulated area from 0 to z).

• To find a probability in the negative region we need to find the equivalent area on the positive side:

P(−1.32 < z < −1.25) = P(1.25 < z < 1.32)
= F(1.32) − F(1.25)
= 0.4066 − 0.3944 = 0.0122 ≈ 1%

Page 29: Statistics (recap)

Working with the Z-Table

• To find P(z ≤ −2.15) we can write:

∫_{−∞}^{−2.15} f(z) dz = ∫_{−∞}^{0} f(z) dz − ∫_{−2.15}^{0} f(z) dz
= 0.5 − 0.4842 = 0.0158 ≈ 2%

(by symmetry, ∫_{−2.15}^{0} f(z) dz ≡ ∫_{0}^{2.15} f(z) dz, which the table gives as 0.4842)

• And finally, to find P(z ≥ 1.93), we have:

∫_{1.93}^{+∞} f(z) dz = ∫_{0}^{+∞} f(z) dz − ∫_{0}^{1.93} f(z) dz
= 0.5 − 0.4732 = 0.0268

Page 30: Statistics (recap)

An Example

o If the income of employees in a big company is normally distributed with μ = £20000 and σ = £4000, what is the probability that a randomly picked employee has an income a) above £22000, b) between £16000 and £24000?

a) We need to transform x to z first:

P(x > 22000) = P((x − 20000)/4000 > (22000 − 20000)/4000)
= P(z > 0.5) = 0.5 − 0.1915 = 0.3085 ≈ 31%

b) P(16000 < x < 24000) = P((16000 − 20000)/4000 < (x − 20000)/4000 < (24000 − 20000)/4000)
= P(−1 < z < 1)
= 0.3413 + 0.3413
= 0.6826 ≈ 68%
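The same answers come out of the standardisation z = (x − μ)/σ combined with the normal CDF; a sketch of the income example (not part of the slides):

```python
# A sketch of the income example: X ~ N(20000, 4000²), standardised and
# evaluated with the normal CDF via math.erf.
from math import erf, sqrt

def phi(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 20000, 4000
p_above_22000 = 1 - phi((22000 - mu) / sigma)                 # P(z > 0.5)
p_16_to_24 = phi((24000 - mu) / sigma) - phi((16000 - mu) / sigma)  # P(−1 < z < 1)

print(round(p_above_22000, 4))   # 0.3085
print(round(p_16_to_24, 4))      # 0.6827
```

The second value is 0.6827 to four decimals; the slide's 0.6826 comes from the truncated table entries.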

Page 31: Statistics (recap)

The χ² (Chi-Squared) Distribution

• The χ² (Chi-Squared) Distribution:

Let Z₁, Z₂, …, Z_k be k independent standard normally distributed random variables; then the sum of their squares

X = Σ_{i=1}^{k} Zᵢ²

has a chi-squared distribution with degrees of freedom equal to the number of random variables (df = k). So, X ~ χ²_k.

The mean value and standard deviation of an RV with a chi-squared distribution are k and √(2k) respectively. So we can write:

X ~ χ²_k(k, 2k)

Probability Density Function (PDF) of the χ² Distribution

Adopted from http://2012books.lardbucket.org/books/beginning-statistics/s15-chi-square-tests-and-f-tests.html

Page 32: Statistics (recap)

Ad

op

ted

from

http

://ww

w.d

ocsto

c.com

/do

cs/80811

492/chi--sq

uare

-table

๐‘ƒ ๐‘ฅ2 = 32 ๐‘‘๐‘“ = 16 = 0.01 or ๐‘ฅ20.01 ,16 = 32

Page 33: Statistics (recap)

The t-Distribution

• If Z ~ N(0, 1) and X ~ χ²_k, and the two random variables Z and X are independent, then the random variable

t = Z / √(X/k) = Z·√k / √X

follows Student's t-distribution (the t-distribution) with k degrees of freedom. For a sample of size n we have df = k = n − 1.

• The mean value and standard deviation of this distribution are:

μ = 0 for n > 2 (undefined for n = 1, 2)

σ = √[(n − 1)/(n − 3)] for n > 3; ∞ for n = 3; undefined for n = 1, 2

Page 34: Statistics (recap)

The t-Distribution

• The t-distribution, like the standard normal distribution, is a bell-shaped and symmetrical distribution with zero mean (n > 2), but it is flatter. As the degrees of freedom increase (i.e. as n increases) it approaches the standard normal distribution, and for n ≥ 30 their behaviours are similar.

• From the table (next slide):

P(t ≥ 1.706 | df = 26) = 0.05 ≈ 5% or t_{0.05,26} = 1.706

(The slide's figure shades the 5% upper tail beyond t = 1.706.)

Adopted from http://education-portal.com/academy/lesson/what-is-a-t-test-procedure-interpretation-examples.html#lesson
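The construction t = Z/√(X/k) can be simulated directly; this Monte Carlo sketch (not part of the slides) checks the table value P(t ≥ 1.706 | df = 26) ≈ 0.05:

```python
# A simulation sketch of t = Z / sqrt(X/k) with k = 26 degrees of
# freedom, checking the tabulated tail P(t ≥ 1.706) ≈ 0.05.
import random
from math import sqrt

random.seed(1)
k, n = 26, 100_000

def chi2(df):
    """One chi-squared draw: a sum of df squared standard normals."""
    return sum(random.gauss(0, 1) ** 2 for _ in range(df))

ts = [random.gauss(0, 1) / sqrt(chi2(k) / k) for _ in range(n)]
p_tail = sum(1 for t in ts if t >= 1.706) / n
print(round(p_tail, 3))
```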

Page 35: Statistics (recap)

df 0.20 0.15 0.10 0.05 0.025 0.01 0.005 0.0025 0.001 0.0005

1 1.376 1.963 3.078 6.314 12.706 31.821 63.656 127.321 318.289 636.578

2 1.061 1.386 1.886 2.920 4.303 6.965 9.925 14.089 22.328 31.600

3 0.978 1.250 1.638 2.353 3.182 4.541 5.841 7.453 10.214 12.924

4 0.941 1.190 1.533 2.132 2.776 3.747 4.604 5.598 7.173 8.610

5 0.920 1.156 1.476 2.015 2.571 3.365 4.032 4.773 5.894 6.869

6 0.906 1.134 1.440 1.943 2.447 3.143 3.707 4.317 5.208 5.959

7 0.896 1.119 1.415 1.895 2.365 2.998 3.499 4.029 4.785 5.408

8 0.889 1.108 1.397 1.860 2.306 2.896 3.355 3.833 4.501 5.041

9 0.883 1.100 1.383 1.833 2.262 2.821 3.250 3.690 4.297 4.781

10 0.879 1.093 1.372 1.812 2.228 2.764 3.169 3.581 4.144 4.587

11 0.876 1.088 1.363 1.796 2.201 2.718 3.106 3.497 4.025 4.437

12 0.873 1.083 1.356 1.782 2.179 2.681 3.055 3.428 3.930 4.318

13 0.870 1.079 1.350 1.771 2.160 2.650 3.012 3.372 3.852 4.221

14 0.868 1.076 1.345 1.761 2.145 2.624 2.977 3.326 3.787 4.140

15 0.866 1.074 1.341 1.753 2.131 2.602 2.947 3.286 3.733 4.073

16 0.865 1.071 1.337 1.746 2.120 2.583 2.921 3.252 3.686 4.015

17 0.863 1.069 1.333 1.740 2.110 2.567 2.898 3.222 3.646 3.965

18 0.862 1.067 1.330 1.734 2.101 2.552 2.878 3.197 3.610 3.922

19 0.861 1.066 1.328 1.729 2.093 2.539 2.861 3.174 3.579 3.883

20 0.860 1.064 1.325 1.725 2.086 2.528 2.845 3.153 3.552 3.850

21 0.859 1.063 1.323 1.721 2.080 2.518 2.831 3.135 3.527 3.819

22 0.858 1.061 1.321 1.717 2.074 2.508 2.819 3.119 3.505 3.792

23 0.858 1.060 1.319 1.714 2.069 2.500 2.807 3.104 3.485 3.768

24 0.857 1.059 1.318 1.711 2.064 2.492 2.797 3.091 3.467 3.745

25 0.856 1.058 1.316 1.708 2.060 2.485 2.787 3.078 3.450 3.725

26 0.856 1.058 1.315 1.706 2.056 2.479 2.779 3.067 3.435 3.707

27 0.855 1.057 1.314 1.703 2.052 2.473 2.771 3.057 3.421 3.689

28 0.855 1.056 1.313 1.701 2.048 2.467 2.763 3.047 3.408 3.674

29 0.854 1.055 1.311 1.699 2.045 2.462 2.756 3.038 3.396 3.660

30 0.854 1.055 1.310 1.697 2.042 2.457 2.750 3.030 3.385 3.646

31 0.853 1.054 1.309 1.696 2.040 2.453 2.744 3.022 3.375 3.633

32 0.853 1.054 1.309 1.694 2.037 2.449 2.738 3.015 3.365 3.622

33 0.853 1.053 1.308 1.692 2.035 2.445 2.733 3.008 3.356 3.611

34 0.852 1.052 1.307 1.691 2.032 2.441 2.728 3.002 3.348 3.601

35 0.852 1.052 1.306 1.690 2.030 2.438 2.724 2.996 3.340 3.591

36 0.852 1.052 1.306 1.688 2.028 2.434 2.719 2.990 3.333 3.582

37 0.851 1.051 1.305 1.687 2.026 2.431 2.715 2.985 3.326 3.574

38 0.851 1.051 1.304 1.686 2.024 2.429 2.712 2.980 3.319 3.566

39 0.851 1.050 1.304 1.685 2.023 2.426 2.708 2.976 3.313 3.558

40 0.851 1.050 1.303 1.684 2.021 2.423 2.704 2.971 3.307 3.551

50 0.849 1.047 1.299 1.676 2.009 2.403 2.678 2.937 3.261 3.496

60 0.848 1.045 1.296 1.671 2.000 2.390 2.660 2.915 3.232 3.460

80 0.846 1.043 1.292 1.664 1.990 2.374 2.639 2.887 3.195 3.416

100 0.845 1.042 1.290 1.660 1.984 2.364 2.626 2.871 3.174 3.390

150 0.844 1.040 1.287 1.655 1.976 2.351 2.609 2.849 3.145 3.357

Infinity 0.842 1.036 1.282 1.645 1.960 2.326 2.576 2.807 3.090 3.290

Page 36: Statistics (recap)

The F Distribution

• If Z₁ ~ χ²_{k₁} and Z₂ ~ χ²_{k₂}, and Z₁ and Z₂ are independent, then the random variable

F = (Z₁/k₁) / (Z₂/k₂)

follows the F distribution with k₁ and k₂ degrees of freedom, i.e.:

F ~ F_{k₁,k₂} or F ~ F(k₁, k₂)

• This distribution is skewed to the right, like the chi-squared distribution, but as k₁ and k₂ increase (n → ∞) it approaches the normal distribution.

Adopted from http://www.vosesoftware.com/ModelRiskHelp/index.htm#Distributions/Continuous_distributions/F_distribution.htm

Page 37: Statistics (recap)

The F Distribution

• The mean and standard deviation of the F distribution are:

μ = k₂/(k₂ − 2) for k₂ > 2, and

σ = [k₂/(k₂ − 2)]·√[2(k₁ + k₂ − 2) / (k₁(k₂ − 4))] for k₂ > 4

• Relation between the t and Chi-Squared distributions and the F distribution:

• For a random variable X ~ t_k it can be shown that X² ~ F_{1,k}. This can also be written as

t²_k = F_{1,k}

• If k₂ is large enough, then k₁·F_{k₁,k₂} ~ χ²_{k₁}.
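The relation t²_k = F_{1,k} can be illustrated by simulation: squared t draws and F(1, k) draws should share the same distribution, and in particular the same mean k/(k − 2). A sketch (not part of the slides):

```python
# A simulation sketch of t_k² = F_{1,k}: squared t samples and F(1, k)
# samples should both have mean k/(k − 2) (≈ 1.083 for k = 26).
import random
from math import sqrt

random.seed(2)
k, n = 26, 50_000

def chi2(df):
    """One chi-squared draw: a sum of df squared standard normals."""
    return sum(random.gauss(0, 1) ** 2 for _ in range(df))

t_sq = [(random.gauss(0, 1) / sqrt(chi2(k) / k)) ** 2 for _ in range(n)]
f_1k = [(chi2(1) / 1) / (chi2(k) / k) for _ in range(n)]

print(round(sum(t_sq) / n, 3), round(sum(f_1k) / n, 3), round(k / (k - 2), 3))
```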

Page 38: Statistics (recap)

α = 0.25. All adopted from http://www.stat.purdue.edu/~yuzhu/stat514s05/tables.html

Page 39: Statistics (recap)

α = 0.10

Page 40: Statistics (recap)

α = 0.05

Page 41: Statistics (recap)

α = 0.025

Page 42: Statistics (recap)

α = 0.01

Page 43: Statistics (recap)

Statistical Inference (Estimation)

• Statistical inference, or statistical induction, is one of the most important aspects of decision making. It refers to the process of drawing a conclusion about the unknown parameters of a population from a sample of randomly chosen data.

• So, the idea is that a sample of randomly chosen data provides the best information about the parameters of the population, and it can be considered representative of the population when its size is reasonably (appropriately) large.

• The first step in statistical inference (induction) is estimation, which is the process of finding an estimate or approximation of the population parameters (such as the mean value and standard deviation) using the data in the sample.

Page 44: Statistics (recap)

Statistical Inference (Estimation)

• The value of X̄ (the sample mean) in a randomly chosen and appropriately large sample is a good estimator of the population mean μ. The value of s² (the sample variance) is likewise a good estimator of the population variance σ².

• Before taking any sample from the population (when the sample is not yet realised or observed) we can talk about the probability distribution of a hypothetical sample. The probability distribution of a random variable x in a hypothetical sample follows the probability distribution of the population, even if the sampling process is repeated many times.

• But the probability distribution of the sample mean X̄ in repeated sampling does not necessarily follow the probability distribution of the population as the number of samples increases.

Page 45: Statistics (recap)

Central Limit Theorem

• Central Limit Theorem:

Imagine a random variable X with any probability distribution, defined in a population with mean μ and variance σ². Suppose we take n independent samples X₁, X₂, …, Xₙ and for each sample calculate the mean, giving X̄₁, X̄₂, …, X̄ₙ (see figure below):

X ~ i.i.d.(μ, σ²) → X̄₁, X̄₂, …, X̄ₙ

i.i.d. ≡ Independent & Identically Distributed RVs

Page 46: Statistics (recap)

Central Limit Theorem

As the number of samples increases indefinitely, the random variable X̄ has a normal distribution (regardless of the population distribution) and we have:

X̄ ~ N(μ, σ²/n) when n → +∞

And in the standard form:

Z = (X̄ − μ_X̄)/σ_X̄ = (X̄ − μ)/(σ/√n) = √n(X̄ − μ)/σ ~ N(0, 1)

o Taking a sample of 36 elements from a population with mean 20 and standard deviation 12, what is the probability that the sample mean falls between 18 and 24?

P(18 < x̄ < 24) = P(−1 < (x̄ − 20)/(12/√36) < 2) = 0.3413 + 0.4772 ≈ 82%
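The example above can be checked by simulating the sampling process; this sketch (not part of the slides) draws samples of 36 from a deliberately non-normal population with μ = 20 and σ = 12, chosen here as a uniform distribution for illustration:

```python
# A simulation sketch of the CLT with the slide's numbers: samples of
# size 36 from a non-normal population with mean 20 and sd 12, checking
# P(18 < x̄ < 24) ≈ 82%.
import random
from math import sqrt

random.seed(3)
n_samples, size, mu, sigma = 50_000, 36, 20, 12

def draw():
    # a uniform population on [μ − σ√3, μ + σ√3] has mean μ and sd σ
    return random.uniform(mu - sigma * sqrt(3), mu + sigma * sqrt(3))

means = [sum(draw() for _ in range(size)) / size for _ in range(n_samples)]
p = sum(1 for m in means if 18 < m < 24) / n_samples
print(round(p, 2))
```

Despite the flat population, the sample means behave like N(20, 144/36) = N(20, 4), and the estimated probability lands near 0.82.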

Page 47: Statistics (recap)

Estimation

• In the previous slides we introduced some of the most important probability distributions for discrete and continuous random variables.

• In many cases we know the nature of the probability distribution of a random variable defined in a population, but have no idea about its parameters, such as the mean value and/or standard deviation.

• Point Estimation:

• To estimate the unknown parameters of the probability distribution of a random variable we can make either a point estimate or an interval estimate, using an estimator.

• The estimator is a function of the sample values x₁, x₂, …, xₙ and is often called a statistic. If θ̂ represents that estimator we have:

θ̂ = f(x₁, x₂, …, xₙ)

Page 48: Statistics (recap)

Estimationโ€ข ๐œฝ is said to be an unbiased estimator of true ๐œฝ (parameter of the

population) if ๐‘ฌ ๐œฝ = ๐œฝ. Because the bias itself is defined as

๐‘ฉ๐’Š๐’‚๐’” = ๐‘ฌ ๐œฝ โˆ’ ๐œฝ

o For example, the sample mean ๐‘ฟ is a point and unbiased estimator for the unknown parameter ๐ (population mean):

๐œฝ = ๐‘ฟ = ๐’‡ ๐’™๐Ÿ, ๐’™๐Ÿ, โ€ฆ , ๐’™๐’ =๐Ÿ

๐’๐’™๐Ÿ + ๐’™๐Ÿ +โ‹ฏ+ ๐’™๐’

It is unbiased because ๐‘ฌ ๐‘ฟ = ๐.

Page 49: Statistics (recap)

• The sample variance in the form s² = Σ(xᵢ − x̄)²/n is a point but biased estimator of the population variance σ² in a small sample:

E(s²) = σ²(1 − 1/n) ≠ σ²

But it is a consistent estimator, because it approaches σ² when the sample size n increases indefinitely (n → ∞).

• With Bessel's correction (changing n to (n − 1)) we can define another sample variance which is unbiased even for a small sample size:

s² = Σ(xᵢ − x̄)²/(n − 1)

• The most common methods of finding point estimators are the least-squares method and the maximum likelihood method; the first of these will be discussed later.
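The bias and Bessel's correction can be seen numerically. The following simulation (an illustrative sketch, not from the slides) draws many samples of size n = 5 from a standard normal population (σ² = 1) and averages the two variance estimators:

```python
import random

random.seed(42)
mu, sigma, n, trials = 0.0, 1.0, 5, 200_000

biased_sum = 0.0    # running sum of s^2 with divisor n
unbiased_sum = 0.0  # running sum of s^2 with divisor n - 1 (Bessel's correction)
for _ in range(trials):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    biased_sum += ss / n
    unbiased_sum += ss / (n - 1)

print(biased_sum / trials)    # close to sigma^2 * (1 - 1/n) = 0.8
print(unbiased_sum / trials)  # close to sigma^2 = 1.0
```

The divisor-n estimator settles near σ²(1 − 1/n) = 0.8, while the Bessel-corrected one settles near the true σ² = 1, matching the expectations on the slide.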


Page 50: Statistics (recap)

Interval Estimation

• Interval estimation, by contrast, provides an interval or a range of possible estimates at a specific level of probability, called the level of confidence, within which the true value of the population parameter may lie.

โ€ข If ๐œฝ๐Ÿ and ๐œฝ๐Ÿ are respectively the lowest and highest estimates of ๐œฝ

the probability that ๐œฝ is covered by the interval ๐œฝ๐Ÿ, ๐œฝ๐Ÿ is:

๐๐ซ ๐œฝ๐Ÿ โ‰ค ๐œฝ โ‰ค ๐œฝ๐Ÿ = ๐Ÿ โˆ’ ๐œถ (0 < ๐›ผ < 1)

Where ๐Ÿ โˆ’ ๐œถ is the level of confidence and ๐œถ itself is called level of

significance. The interval ๐œฝ๐Ÿ, ๐œฝ๐Ÿ is called confidence interval.

Page 51: Statistics (recap)

Interval Estimation

How to find θ̂₁ and θ̂₂? In order to find the lower and upper limits of a confidence interval we need prior knowledge about the distribution of the random variable in the population. If the random variable x is normally distributed in the population and the population standard deviation (σ) is known, the 95% confidence interval for the unknown population mean (μ) can be constructed by finding the symmetric z-values associated with 95% of the area under the standard normal curve:

1 − α = 95% → α = 5% → α/2 = 2.5%

So, ±z₀.₀₂₅ = ±1.96.

We know that z = (X̄ − μ)/σ(X̄) = (X̄ − μ)/(σ/√n), so:

P(−z_{α/2} ≤ z ≤ +z_{α/2}) = 95%

Adopted & altered from http://upload.wikimedia.org/wikipedia/en/b/bf/NormalDist1.96.png

=1โˆ’๐›ผ

๐œถ

๐Ÿ= ๐ŸŽ. ๐ŸŽ๐Ÿ๐Ÿ“

๐œถ

๐Ÿ= ๐ŸŽ. ๐ŸŽ๐Ÿ๐Ÿ“

โˆ’๐’ ๐œถ ๐Ÿ= = ๐’ ๐œถ ๐Ÿ

Page 52: Statistics (recap)

Interval Estimation

• So we can write:

P(x̄ − 1.96 σ(x̄) ≤ μ ≤ x̄ + 1.96 σ(x̄)) = 0.95

or

P(x̄ − 1.96 σ/√n ≤ μ ≤ x̄ + 1.96 σ/√n) = 0.95

Therefore, the interval (x̄ − 1.96 σ/√n , x̄ + 1.96 σ/√n) represents a 95% confidence interval (CI₉₅%) for the unknown value of μ.

It means that in repeated random sampling (say, 100 times) we expect 95 out of 100 such intervals to cover the unknown value of the population mean μ.

Adopted and altered from http://forums.anarchy-online.com/showthread.php?t=604728

Page 53: Statistics (recap)

Interval Estimation for Population Proportion

A confidence interval can also be constructed for the population proportion.

If X ~ Bi(n, p), with mean μ = np and variance σ² = np(1 − p), then each sample yields a sample proportion p̂ (p̂₁, p̂₂, …, across repeated samples). In repeated random sampling p̂ has its own probability distribution, with mean and variance:

μ(p̂) = E(p̂) = p = μ/n

σ²(p̂) = var(p̂) = σ²/n² = p(1 − p)/n

Page 54: Statistics (recap)

Interval Estimation for Population Proportion

• The 90% confidence interval for the population proportion p, when the sample size is bigger than 30 (n > 30) and there is no information about the population variance, is constructed as follows. With 1 − α = 90% we have α/2 = 0.05, so −z_{α/2} = −1.645 and +z_{α/2} = +1.645, and:

±z_{α/2} = (p̂ − p)/√(p̂(1 − p̂)/n)

P(−z_{α/2} ≤ z ≤ +z_{α/2}) = 1 − α

P(p̂ − z_{α/2}·√(p̂(1 − p̂)/n) ≤ p ≤ p̂ + z_{α/2}·√(p̂(1 − p̂)/n)) = 0.9

So, the confidence interval can simply be written as:

CI₉₀% = p̂ ∓ 1.645·√(p̂(1 − p̂)/n)

Obviously, if we had knowledge of the population variance we would be able to estimate the population proportion p directly. Why?

Adopted and altered from http://www.stat.wmich.edu/s216/book/node83.html
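The proportion interval above translates directly into code. The sketch below is illustrative (the sample values p̂ = 0.4 and n = 100 are made up, not from the slides); z defaults to the 90% table value 1.645:

```python
import math

def proportion_ci(p_hat: float, n: int, z: float = 1.645):
    """CI for a population proportion: p_hat ± z * sqrt(p_hat(1-p_hat)/n).
    z = 1.645 gives a 90% interval; 1.96 would give 95%."""
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

lo, hi = proportion_ci(0.4, n=100)  # e.g. 40 "successes" out of 100
print(round(lo, 4), round(hi, 4))   # → 0.3194 0.4806
```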

Page 55: Statistics (recap)

Examples

o Imagine the weight of people in a society is distributed normally. A random sample of 25 with sample mean 72 kg is taken from this society. If the standard deviation of the population is 6 kg, find a) the 90%, b) the 95% and c) the 99% confidence interval for the unknown population mean.

a) 1 − α = 0.9 → α/2 = 0.05 → z_{α/2} = 1.645

So, CI₉₀% = 72 ± 1.645 × 6/√25 = (70.03, 73.97)

b) 1 − α = 0.95 → α/2 = 0.025 → z_{α/2} = 1.96

So, CI₉₅% = 72 ± 1.96 × 6/√25 = (69.65, 74.35)

c) 1 − α = 0.99 → α/2 = 0.005 → z_{α/2} = 2.58

So, CI₉₉% = 72 ± 2.58 × 6/√25 = (68.90, 75.10)
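The three intervals can be reproduced with a few lines of code, using the same table z-values as the example:

```python
import math

def mean_ci(xbar, sigma, n, z):
    """CI for mu with known population sigma: xbar ± z * sigma / sqrt(n)."""
    half = z * sigma / math.sqrt(n)
    return xbar - half, xbar + half

for level, z in [("90%", 1.645), ("95%", 1.96), ("99%", 2.58)]:
    lo, hi = mean_ci(72, sigma=6, n=25, z=z)
    print(level, round(lo, 2), round(hi, 2))
# 90% 70.03 73.97
# 95% 69.65 74.35
# 99% 68.9 75.1
```

Note how the interval widens as the confidence level rises: more certainty of covering μ costs a wider range.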

Page 56: Statistics (recap)

Examples

o Samples from one of the production lines in a factory suggest that 10% of products are defective. If a difference of 1% between the sample and population proportion is acceptable, what sample size do we need to construct a 95% confidence interval for the population proportion? What if the acceptable gap between the sample & population proportion increases to 3%?

1 − α = 0.95 → α/2 = 0.025 → z_{α/2} = 1.96

z_{α/2} = (p̂ − p)/√(p̂(1 − p̂)/n) → 1.96 = 0.01/√(0.1 × 0.9/n) → n = (196 × 0.3)² ≈ 3458

If the gap increases to 3%, then:

1.96 = 0.03/√(0.1 × 0.9/n) → n = (196 × 0.1)² ≈ 385

Page 57: Statistics (recap)

Interval Estimation (Using the t-distribution)

• If the population standard deviation σ is unknown and we use the sample standard deviation s instead, and the size of the sample is less than 30 (n < 30), then the random variable

(x̄ − μ)/(s/√n) ~ t₍ₙ₋₁₎

has a t-distribution with df = n − 1.

This means a confidence interval for the population mean μ will be of the form:

CI₍₁₋α₎ = (x̄ − t_{α/2, n−1}·s/√n , x̄ + t_{α/2, n−1}·s/√n)

โˆ’๐’• ๐œถ๐Ÿ,๐’โˆ’๐Ÿ

๐’• ๐œถ๐Ÿ,๐’โˆ’๐Ÿ

1 โˆ’ ๐›ผ % ๐œถ

๐Ÿ

๐œถ

๐Ÿ

Adopted and altered from http://cnx.org/content/m46278/latest/?collection=col11521/latest

Page 58: Statistics (recap)

Interval Estimation

• The following flowchart can help in choosing between the Z- and t-distributions when an interval estimate is constructed for μ in the population (and, where neither applies, it points to nonparametric methods).

Adopted from http://www.expertsmind.com/questions/flow-chart-for-confidence-interval-30112489.aspx

Page 59: Statistics (recap)

Interval Estimation

• Here is a list of confidence intervals for the subject parameters in the population.

Adopted from http://www.bls-stats.org/uploads/1/7/6/7/1767713/250709.image0.jpg

Page 60: Statistics (recap)

Hypothesis Testing

• Hypothesis testing is one of the important aspects of statistical inference. The main idea is to find out whether some claims/statements (in the form of hypotheses) about population parameters can be statistically rejected by the evidence from the sample, using a test statistic (a function of the sample).

• Claims are made in the form of a null hypothesis (H₀) against an alternative hypothesis (H₁), and they can only ever be rejected, never proven. These two hypotheses should be mutually exclusive and collectively exhaustive. For example:

๐ป0: ๐œ‡ = 0.8 ๐‘Ž๐‘”๐‘Ž๐‘–๐‘›๐‘ ๐‘ก ๐ป1: ๐œ‡ โ‰  0.8

๐ป0: ๐œ‡ โ‰ฅ 2.1 ๐‘Ž๐‘”๐‘Ž๐‘–๐‘›๐‘ ๐‘ก ๐ป1: ๐œ‡ < 2.1

๐ป0: ๐œŽ2 โ‰ค 0.4 ๐‘Ž๐‘”๐‘Ž๐‘–๐‘›๐‘ ๐‘ก ๐ป1: ๐œŽ

2 > 0.4

Always remember that the equality sign comes with ๐ป0.

โ€ข If the value of the test statistic lies in the rejection area(s) the null hypothesis must be rejected, otherwise the sample does not provide sufficient evidence to reject the null hypothesis.

Page 61: Statistics (recap)

Hypothesis Testing

• Assuming we know the distribution of the random variable in the population, and that different random variables are statistically independent, hypothesis testing follows these steps:

1. State the relevant null & alternative hypotheses. The form of the null hypothesis (whether it uses =, ≥ or ≤) indicates how many rejection regions we will have: for the = sign there are two regions, and for the others just one (depending on the difference between the value of the estimator and the claimed value of the population parameter, the rejection area could be on the right or the left of the distribution curve).

H₀: μ = 0.5 against H₁: μ ≠ 0.5

H₀: μ ≥ 0.5 (or μ ≤ 0.5) against H₁: μ < 0.5 (or μ > 0.5)

Graphs adopted from http://www.soc.napier.ac.uk/~cs181/Modules/CM/Statistics/Statistics%203.html

Page 62: Statistics (recap)

Hypothesis Testing

2. Identify the level of significance of the test (α); it is usually taken to be 5% or 1%, depending on the nature of the test and the goals of the researcher. When α is known, together with prior knowledge about the sampling distribution, the critical region(s) (or rejection area(s)) can be identified.

For a one-tail test on the standard normal distribution, the critical values associated with the significance levels α = 5% and α = 1% are z_α = 1.65 and z_α = 2.33 respectively.

Adopted from http://www.psychstat.missouristate.edu/introbook/sbk26.htm

Page 63: Statistics (recap)

Hypothesis Testing

3. Construct a test statistic (a function based on the sample distribution & sample size). This function is used to decide whether or not to reject H₀. A table of some of the test statistics for testing different hypotheses is given at the link below.

Table adopted from http://www.bls-stats.org/uploads/1/7/6/7/1767713/250714.image0.jpg

Page 64: Statistics (recap)

Hypothesis Testing

4. Take a random sample from the population and calculate the value of the test statistic. If the value falls in the rejection area, the null hypothesis H₀ is rejected in favour of the alternative H₁ at the predetermined significance level α; otherwise the sample does not provide sufficient evidence to reject H₀ (this does not mean that we accept H₀).

The critical boundaries are:

−z_α or −t_{α,df} for a left-tail test

+z_α or +t_{α,df} for a right-tail test

±z_{α/2} or ±t_{α/2,df} for a two-tail test

Adopted from http://www.onekobo.com/Articles/Statistics/03-Hypotheses/Stats3%20-%2010%20-%20Rejection%20Region.htm

Page 65: Statistics (recap)

Example

o A chocolate factory claims that its new tin of cocoa powder contains at least 500 gr of the powder. A standards checking agency takes a random sample of n = 25 tins and finds a sample mean weight of X̄ = 520 gr and a sample standard deviation of s = 75 gr. If we assume the weight of cocoa powder in tins has a normal distribution, does the sample provide enough evidence to reject the claim at the 95% level of confidence?

1. H₀: μ ≥ 500 against H₁: μ < 500 (so it is a one-tail, left-tail test)

2. Level of significance α = 5% → t_{α,(n−1)} = t₀.₀₅,₂₄ = 1.711 (we use the t-distribution because n < 30 and we have no prior knowledge of the population standard deviation)

3. The value of the test statistic is: t = (X̄ − μ)/(s/√n) = (520 − 500)/(75/√25) = 1.33

4. As t = 1.33 lies above the critical value −1.711, it is not in the left-tail rejection area, so H₀ (the claim) cannot be rejected at the 5% level of significance.
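The cocoa-tin test can be sketched in a few lines; the critical value 1.711 is the t-table value for t₀.₀₅,₂₄ used in the example:

```python
import math

# One-sample left-tail t-test of H0: mu >= 500 against H1: mu < 500.
n, xbar, s, mu0 = 25, 520.0, 75.0, 500.0
t_crit = 1.711  # t-table value for alpha = 0.05, df = n - 1 = 24

t_stat = (xbar - mu0) / (s / math.sqrt(n))
print(round(t_stat, 2))       # → 1.33

reject_h0 = t_stat < -t_crit  # the rejection region is the left tail
print(reject_h0)              # → False: H0 cannot be rejected
```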

Page 66: Statistics (recap)

Type I & Type II Errors

• Two types of errors can occur in hypothesis testing:

A. Type I error: when, based on our sample, we reject a true null hypothesis.

B. Type II error: when, based on our sample, we fail to reject a false null hypothesis.

• By reducing the level of significance α we can reduce the probability of making a type I error (why?); however, at the same time, we increase the probability of making a type II error.

โ€ข What would happen to type I and type II errors if we increase the sample size? (Hint: look at the confidence intervals)

Adopted from http://whatilearned.wikia.com/wiki/Hypothesis_Testing?file=Type_I_and_Type_II_Error_Table.jpg

Page 67: Statistics (recap)

Type I & Type II Errors

• The following graph shows how moving the critical line (critical value) changes the probability of making type I and type II errors:

P(Type I error) = α and P(Type II error) = β

Adopted from http://www.weibull.com/hotwire/issue88/relbasics88.htm

The Power of a Test:

The power of a test is the probability that the test will correctly reject a false null hypothesis; that is, the probability of not committing a type II error. The power is equal to 1 − β, which means that by reducing β the power of the test will increase.

Page 68: Statistics (recap)

The P-Value

• It is not unusual to reject H₀ at some level of significance, for example α = 5%, but be unable to reject it at some other level, e.g. α = 1%. The dependence of the final decision on the value of α is the weak point of the classical approach.

• In the new approach, we try to find the p-value, which is the lowest significance level at which H₀ can be rejected. If the level of significance is set at 5% and the lowest significance level at which H₀ can be rejected (the p-value) is 2%, then the null hypothesis should be rejected; i.e.

p-value < α → Reject H₀

To understand this concept better let's look at an example:

• Suppose we believe that the mean life expectancy of people in a city is 75 years (H₀: μ = 75), but our observation shows a sample mean of 76 years for a sample of size 100 with a sample standard deviation of 4 years.

Page 69: Statistics (recap)

The P-Value

• The Z-score (test statistic) can be calculated as follows:

z = (X̄ − μ)/(s/√n) = (76 − 75)/(4/√100) = 2.5

• At the 5% level of significance the critical Z-value is 1.96, so we must reject H₀. But we should not have obtained this result (or those observations in our random sample) in the first place if our assumption about the population mean μ was correct.

• The p-value is the probability of obtaining this kind of result, or one even more extreme (i.e. a Z-score bigger than 2.5), given that the null hypothesis is correct:

P(z ≥ 2.5 | μ = 75) = p-value ≈ 0.006

It means that in 1000 samples this type of result can theoretically happen about 6 times; yet it has happened in our very first random sample.

Adopted from http://faculty.elgin.edu/dkernler/statistics/ch10/10-2.html
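The p-value in this example can be computed from `math.erfc` rather than a normal table. A minimal sketch:

```python
import math

def p_value_right_tail(z: float) -> float:
    """P(Z >= z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

z = (76 - 75) / (4 / math.sqrt(100))    # test statistic from the example
print(z)                                 # → 2.5
print(round(p_value_right_tail(z), 4))   # → 0.0062
```

Since 0.0062 < 0.05, H₀ is rejected at the 5% level, in agreement with the classical critical-value comparison (2.5 > 1.96).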

Page 70: Statistics (recap)

The P-Value

• As we cannot deny what we have observed and obtained from the sample, we eventually need to change our belief about the population mean and reject our assumption about it.

• The smaller the p-value, the stronger the evidence against H₀.

