Quantitative methods for finance
Lecture 2
Serafeim Tsoukas
Probability
• Probability underlies statistical inference – the
drawing of conclusions from a sample of data.
• If samples are drawn at random, their
characteristics (such as the sample mean) depend
upon chance.
• Hence to understand how to interpret sample
evidence, we need to understand chance, or
probability.
The definition of probability
• The probability of an event A may be defined in
different ways:
The frequentist view: the proportion of trials in
which the event occurs, as the number of trials
approaches infinity.
The subjective view: someone’s degree of belief
about the likelihood of an event occurring.
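The frequentist view above can be illustrated with a short simulation (a minimal Python sketch, not part of the lecture): the proportion of heads in repeated coin tosses settles toward 1/2 as the number of trials grows.

```python
import random

random.seed(42)  # fixed seed so the run is reproducible

# Frequentist view: estimate Pr(heads) as the proportion of heads
# observed in a large number of trials.
n_trials = 100_000
heads = sum(random.random() < 0.5 for _ in range(n_trials))
estimate = heads / n_trials
print(estimate)  # close to 0.5, and closer as n_trials grows
```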
Some vocabulary
• An experiment: an activity such as tossing a coin,
which has a range of possible outcomes.
• A trial: a single performance of the experiment.
• The sample space: all possible outcomes of the
experiment. For a single toss of a coin, the
sample space is {heads, tails}.
Probabilities
• With each outcome in the sample space we can
associate a probability (calculated according to
either the frequentist or subjective view).
• Pr(heads) = 1/2
Pr(tails) = 1/2.
• This is an example of a probability distribution
(more detail in Chapter 3).
Rules for probabilities
• 0 ≤ Pr(A) ≤ 1
• Σ Pr(A) = 1, summed over all possible outcomes A
• Pr(not-A) = 1 − Pr(A), where not-A is the
complement of the event A.
Examples: picking a card from a pack
• The probability of picking any one card
from a pack (e.g. king of spades) is 1/52.
This is the same for each card.
• Summing over all cards: 1/52 + 1/52 + ⋯ = 1.
• Pr(not-king of spades) = 51/52 = 1 − Pr(king of
spades).
Compound events
• Often we need to calculate more complicated
probabilities:
What is the probability of drawing any spade?
What is the probability of throwing a ‘double six’
with two dice?
What is the probability of collecting a sample of
people whose average IQ is greater than 100?
• These are compound events.
Rules for calculating compound probabilities
1. The addition rule: the ‘or’ rule
Pr(A or B) = Pr(A) + Pr(B)
The probability of rolling a five or six on a single
roll of a die is
Pr(5 or 6) = Pr(5) + Pr(6) = 1/6 + 1/6 = 1/3.
A slight complication
• If A and B can occur simultaneously, the previous
formula gives the wrong answer:
Pr(king or heart) = 4/52 + 13/52 = 17/52.
• This double counts the king of hearts.
[Figure: the 52 cards laid out by suit (spades, hearts,
diamonds, clubs) and rank (A, K, Q, …, 2), with the 16
cards that are a king or a heart highlighted.]
• We therefore subtract the king of hearts:
Pr(king or heart) = 4/52 + 13/52 − 1/52 = 16/52.
• The formula is therefore
Pr(A or B) = Pr(A) + Pr(B) − Pr(A and B).
• When A and B cannot occur simultaneously, Pr(A
and B) = 0.
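The corrected formula can be checked by brute-force enumeration of a deck; a minimal Python sketch:

```python
from itertools import product

# Enumerate the 52-card deck and count cards that are a king or a heart.
ranks = ["A", "K", "Q", "J", "10", "9", "8", "7", "6", "5", "4", "3", "2"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))

king_or_heart = [c for c in deck if c[0] == "K" or c[1] == "hearts"]
print(len(king_or_heart), len(deck))  # 16 52, i.e. 4/52 + 13/52 - 1/52
```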
The multiplication rule
• When you want to calculate Pr(A and B), for
independent events A and B:
Pr(A and B) = Pr(A) × Pr(B).
• The probability of obtaining a double-six when
rolling two dice is
Pr(six and six) = Pr(six) × Pr(six)
= 1/6 × 1/6 = 1/36.
• This is very unlikely!
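The same answer comes from enumerating all equally likely outcomes of the two dice; a minimal Python sketch:

```python
from itertools import product
from fractions import Fraction

# All 36 equally likely outcomes of rolling two dice.
outcomes = list(product(range(1, 7), repeat=2))
double_six = [o for o in outcomes if o == (6, 6)]
prob = Fraction(len(double_six), len(outcomes))
print(prob)  # 1/36
```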
Another slight complication:
independence
• Pr(drawing two aces from a pack of cards, without replacement).
• If the first card drawn is an ace (P = 4/52), that leaves 51 cards, of which 3 are aces.
• The probability of drawing the second ace is 3/51, different from the probability of drawing the first ace. They are not independent events. The probability changes.
• Pr(two aces) = 4/52 × 3/51 = 1/221.
Conditional probability
• 3/51 is the probability of drawing an ace given
that an ace was drawn as the first card.
• This is the conditional probability and is written
Pr(Second ace | ace on first draw).
• To simplify notation write as Pr(A2|A1).
• This is the probability of event A2 occurring, given
A1 has occurred.
• Consider Pr(A2 | not-A1)
• A ‘not-ace’ is drawn first, leaving 51 cards of
which 4 are aces
• Hence Pr(A2 | not-A1) = 4/51
• So Pr(A2 | not-A1) ≠ Pr(A2 | A1)
• They are not independent events.
The multiplication rule again
• The general rule is
Pr(A and B) = Pr(A) × Pr(B|A).
• For independent events
Pr(B|A) = Pr(B|not-A) = Pr(B)
• and so
Pr(A and B) = Pr(A) × Pr(B).
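The card example can be checked exactly with rational arithmetic; a minimal Python sketch using the conditional probabilities from the slides:

```python
from fractions import Fraction

# Conditional probabilities from the two-aces example, as exact fractions.
pr_a1 = Fraction(4, 52)             # first card is an ace
pr_a2_given_a1 = Fraction(3, 51)    # second ace, given first was an ace
pr_a2_given_not_a1 = Fraction(4, 51)

# General multiplication rule: Pr(two aces) = Pr(A1) * Pr(A2|A1)
print(pr_a1 * pr_a2_given_a1)  # 1/221

# Unconditionally, the second card is an ace with probability 4/52,
# the same as the first: weight each conditional by its probability.
pr_a2 = pr_a1 * pr_a2_given_a1 + (1 - pr_a1) * pr_a2_given_not_a1
print(pr_a2)  # 1/13, i.e. 4/52
```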
Combining the rules
• Pr(exactly 1 head in two tosses)
= Pr([H and T] or [T and H])
= Pr([H and T]) + Pr([T and H])
= [1/2 × 1/2] + [1/2 × 1/2]
= 1/4 + 1/4 = 1/2.
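The same result falls out of listing the four equally likely outcomes of two tosses; a minimal Python sketch:

```python
from itertools import product
from fractions import Fraction

# The four equally likely outcomes of two coin tosses.
outcomes = list(product("HT", repeat=2))  # (H,H), (H,T), (T,H), (T,T)
one_head = [o for o in outcomes if o.count("H") == 1]
prob = Fraction(len(one_head), len(outcomes))
print(prob)  # 1/2
```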
The tree diagram
• First toss: H or T, each with probability ½.
• Second toss: again H or T, each with probability ½.
• Four routes through the tree, each with
probability ¼:
{H,H}: P = ¼
{H,T}: P = ¼
{T,H}: P = ¼
{T,T}: P = ¼
• Pr(exactly 1 head) = Pr({H,T}) + Pr({T,H}) = ½.
But it gets complicated quickly
• Pr(3 heads in 5 tosses)?
• Pr(30 heads in 50 tosses)?
• How many routes? Drawing takes too much time,
we need a formula.
The Binomial distribution
• If we define P as the probability of heads, and
hence (1 − P) is the probability of tails, we can
write:
Pr(1 head in 2 tosses) = P^1 × (1 − P)^1 × 2C1
• or, in general
Pr(r heads) = P^r × (1 − P)^(n−r) × nCr.
• This is the formula for the Binomial distribution.
Example
• P(3 heads in 5 tosses):
n = 5, r = 3, P = ½
• Pr(3 heads) = P^r × (1 − P)^(n−r) × nCr
= (½)^3 × (1 − ½)^2 × 5C3
= 1/8 × 1/4 × 5!/(3! × 2!)
= 10/32.
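The Binomial formula above is easy to compute directly; a minimal Python sketch using the standard-library `math.comb` for nCr:

```python
from math import comb

def binom_pmf(r, n, p):
    """Pr(r successes in n trials), each with success probability p."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# 3 heads in 5 tosses of a fair coin:
print(binom_pmf(3, 5, 0.5))  # 0.3125, i.e. 10/32
```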
Bayes’ Theorem
• A ball is drawn at random from one of the boxes
below. It is red.
• Intuitively, it seems more likely to have come
from Box A. But what is the precise probability?
Bayes’ theorem answers this question.
[Figure: Box A and Box B, each containing 10 balls;
6 of Box A's balls are red, 3 of Box B's balls are red.]
Solution
• We require Pr(A|R). This can be written:
Pr(A|R) = Pr(A and R) / Pr(R).
• Expanding top and bottom we have:
Pr(A|R) = Pr(R|A) × Pr(A)
/ [Pr(R|A) × Pr(A) + Pr(R|B) × Pr(B)].
• We now have the answer in terms of probabilities
we can evaluate.
• Hence we obtain:
Pr(A|R) = (6/10 × 0.5)
/ (6/10 × 0.5 + 3/10 × 0.5) = 2/3.
• There is a 2/3 probability that the ball was taken
from Box A.
• A similar calculation yields Pr(B|R) = 1/3.
• These are the posterior probabilities. The prior
probabilities were ½, ½.
Prior and posterior probabilities
• Prior probabilities: Pr(A), Pr(B)
• Likelihoods: Pr(R|A), Pr(R|B)
• Posterior probabilities: Pr(A|R), Pr(B|R)
posterior probability =
(likelihood × prior probability)
/ Σ (likelihood × prior probability)
• The posterior probabilities are calculated as 2/3
and 1/3, as before.
Table of likelihoods and probabilities
        Prior probability   Likelihood   Prior × likelihood   Posterior probability
A       0.5                 0.6          0.30                 0.30/0.45 = 2/3
B       0.5                 0.3          0.15                 0.15/0.45 = 1/3
Total                                    0.45
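The table's calculation is just Bayes' theorem applied row by row; a minimal Python sketch with exact fractions:

```python
from fractions import Fraction

# Priors and likelihoods from the two-box example.
priors = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
likelihoods = {"A": Fraction(6, 10), "B": Fraction(3, 10)}  # Pr(R|box)

# Prior x likelihood for each box, then normalise by their sum Pr(R).
joint = {box: priors[box] * likelihoods[box] for box in priors}
total = sum(joint.values())  # Pr(R) = 0.45
posteriors = {box: joint[box] / total for box in joint}
print(posteriors["A"], posteriors["B"])  # 2/3 1/3
```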
Summary
• Probability underlies statistical inference.
• There are rules (e.g. the multiplication rule) for
calculating probabilities.
• Independence simplifies the rules.
• These rules lead on to probability distributions
such as the Binomial.
• Bayes’ theorem tells us how to update
probabilities in the light of evidence.
Hypothesis testing
• Hypothesis testing is about making decisions.
• Is a hypothesis true or false?
• e.g. are women paid less, on average, than men?
Principles of hypothesis testing
• The null hypothesis is initially presumed to be true.
• Evidence is gathered, to see if it is consistent with the
hypothesis.
• If it is, the null hypothesis continues to be considered
‘true’ (later evidence might change this).
• If not, the null is rejected in favour of the alternative
hypothesis.
Two possible types of error
• Decision making is never perfect and mistakes
can be made
– Type I error: rejecting the null when true
– Type II error: accepting the null when false.
Type I and Type II errors
True situation
Decision H0 true H0 false
Accept H0
Correct
decision Type II error
Reject H0 Type I error Correct
decision
Avoiding incorrect decisions
• We wish to avoid both Type I and II errors.
• We can alter the decision rule to do this.
• Unfortunately, reducing the chance of making a
Type I error generally means increasing the
chance of a Type II error.
• Hence a trade-off.
Diagram of the decision rule
[Figure: sampling distributions of the sample mean under H1 (left)
and H0 (right), divided by the decision line xD into the rejection
region and the non-rejection region. The Type I error is the area of
the H0 distribution falling in the rejection region; the Type II
error is the area of the H1 distribution in the non-rejection region.]
How to make a decision
• Where do we place the decision line?
• Set the Type I error probability to a particular
value. By convention, this is 5%.
• This is known as the significance level of the test.
It is complementary to the confidence level of
estimation.
• 5% significance level ↔ 95% confidence level.
Example: How long do LEDs last?
• A manufacturer of LEDs claims its product lasts
at least 5,000 hours, on average.
• A sample of 80 LEDs is tested. The average time
before failure is 4,900 hours, with standard
deviation 500 hours.
• Should the manufacturer’s claim be accepted or
rejected?
The hypotheses to be tested
• H0: μ = 5,000
H1: μ < 5,000
• This is a one-tailed test, since the rejection region
occupies only one side of the distribution.
Should the null hypothesis be rejected?
• Is 4,900 far enough below 5,000?
• Is it more than 1.64 standard errors below 5,000?
(1.64 standard errors below the mean cuts off the
bottom 5% of the Normal distribution).
z = (x̄ − μ) / √(s²/n)
= (4,900 − 5,000) / √(500²/80) = −1.79.
• 4,900 is 1.79 standard errors below 5,000, so falls into the
rejection region (bottom 5% of the distribution)
• Hence, we can reject H0 at the 5% significance level or,
equivalently, with 95% confidence.
• If the true mean were 5,000, there is less than a 5%
chance of obtaining sample evidence such as
x̄ = 4,900 from a sample of n = 80.
Formal layout of a problem
1. H0: μ = 5,000
H1: μ < 5,000
2. Choose significance level: 5%
3. Look up critical value: z* = 1.64
4. Calculate the test statistic: z = −1.79
5. Decision: reject H0 since −1.79 < −1.64 and falls
into the rejection region.
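The five steps above can be sketched in code; a minimal Python version of the LED test (using n = 80 as in the lecture's calculation, and the error function to get the prob-value discussed later):

```python
from math import sqrt, erf

def normal_cdf(z):
    """Standard normal CDF, built from the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# One-tailed z-test: H0: mu = 5000 against H1: mu < 5000.
xbar, mu0, s, n = 4900, 5000, 500, 80
z = (xbar - mu0) / sqrt(s**2 / n)
print(round(z, 2))              # -1.79
print(round(normal_cdf(z), 3))  # about 0.037, the prob-value
print(z < -1.64)                # True: reject H0 at the 5% level
```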
One- versus two-tailed tests
• Should you use a one-tailed (H1: μ < 5,000) or two-tailed (H1: μ ≠ 5,000) test?
• If you are only concerned about falling one side of the hypothesised value (as here: we would not worry if LEDs lasted longer than 5,000 hours) use the one-tailed test. You would not want to reject H0 if the sample mean were anywhere above 5,000.
• If, for some other reason, you know one side is impossible (e.g. demand curves cannot slope upwards), use a one-tailed test.
• Otherwise, use a two-tailed test.
• If unsure, choose a two-tailed test.
• Never choose between a one- or two-tailed test on
the basis of the sample evidence (i.e. do not
choose a one-tailed test because you notice that
4,900 < 5,000).
• The hypothesis should be chosen before looking
at the evidence!
Two-tailed test example
• It is claimed that an average child spends 15
hours per week watching television. A survey of
100 children finds an average of 14.5 hours per
week, with standard deviation 8 hours. Is the
claim justified?
• The claim would be wrong if children spend either
more or less than 15 hours watching TV. The
rejection region is split across the two tails of the
distribution. This is a two-tailed test.
A two-tailed test – diagram
[Figure: the distribution of x̄ under H0, with 2.5% rejection
regions ("Reject H0") in each tail and the central 95%
non-rejection region.]
Solution to the problem
1. H0: μ = 15 H1: μ ≠ 15
2. Choose significance level: 5%
3. Look up critical value: z* = 1.96
4. Calculate the test statistic:
z = (x̄ − μ) / √(s²/n)
= (14.5 − 15) / √(8²/100) = −0.625.
5. Decision: we do not reject H0, since |−0.625| < 1.96
and the test statistic does not fall into the rejection region.
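The two-tailed version only changes the decision rule (compare |z| against z*); a minimal Python sketch of the television example:

```python
from math import sqrt

# Two-tailed z-test: H0: mu = 15 against H1: mu != 15.
xbar, mu0, s, n = 14.5, 15, 8, 100
z = (xbar - mu0) / sqrt(s**2 / n)
print(round(z, 3))    # -0.625
print(abs(z) > 1.96)  # False: do not reject H0
```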
The choice of significance level
• Why 5%?
• Like its complement, the 95% confidence level, it
is a convention. A different value can be chosen,
but it does set a benchmark.
• If the cost of making a Type I error is especially
high, then set a lower significance level, e.g. 1%.
The significance level is the probability of making
a Type I error.
The prob-value approach
• An alternative way of making the decision
• Returning to the LED problem, the test statistic z
= −1.79 cuts off 3.67% in the lower tail of the
distribution. 3.67% is the prob-value for this
example
• Since 3.67% < 5% the test statistic must fall into
the rejection region for the test.
Two ways to rejection
Reject H0 if either
• z < −z* (−1.79 < −1.64)
or
• the prob-value < the significance level
(3.67% < 5%).
Testing a proportion
• Same principles: reject H0 if the test statistic falls
into the rejection region.
• To test H0: π = 0.5 versus H1: π ≠ 0.5 (e.g. a coin
is fair or not) the test statistic is
z = (p − π) / √(π(1 − π)/n).
• If the sample evidence were 60 heads from 100
tosses (p = 0.6) we would have
z = (0.6 − 0.5) / √(0.5 × (1 − 0.5)/100) = 2
• so we would (just) reject H0 since 2 > 1.96.
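The coin example translates directly; a minimal Python sketch:

```python
from math import sqrt

# z-test for a proportion: H0: pi = 0.5, with 60 heads in 100 tosses.
p, pi0, n = 0.6, 0.5, 100
z = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)
print(round(z, 2))    # 2.0
print(abs(z) > 1.96)  # True: (just) reject H0
```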
Testing the difference of two means
• To test whether two samples are drawn from
populations with the same mean
• H0: μ1 = μ2 or H0: μ1 − μ2 = 0
H1: μ1 ≠ μ2 or H1: μ1 − μ2 ≠ 0
• The test statistic is
z = [(x̄1 − x̄2) − (μ1 − μ2)] / √(s1²/n1 + s2²/n2).
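A minimal Python sketch of this test, using hypothetical sample figures (not from the lecture) purely for illustration:

```python
from math import sqrt

# Hypothetical samples: do the two groups share the same mean?
xbar1, s1, n1 = 105, 12, 60
xbar2, s2, n2 = 100, 10, 50

# Under H0: mu1 - mu2 = 0.
z = ((xbar1 - xbar2) - 0) / sqrt(s1**2 / n1 + s2**2 / n2)
print(round(z, 2))    # about 2.38
print(abs(z) > 1.96)  # True for these numbers: reject H0
```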
Testing the difference of two proportions
• To test whether two sample proportions are equal
• H0: π1 = π2 or H0: π1 − π2 = 0
H1: π1 ≠ π2 or H1: π1 − π2 ≠ 0
• The test statistic is
z = [(p1 − p2) − (π1 − π2)] / √(p̂(1 − p̂)/n1 + p̂(1 − p̂)/n2)
• where p̂ = (n1p1 + n2p2) / (n1 + n2).
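A minimal Python sketch of the two-proportion test, with hypothetical counts (not from the lecture) for illustration:

```python
from math import sqrt

# Hypothetical data: 60/100 successes versus 50/100 successes.
p1, n1 = 0.60, 100
p2, n2 = 0.50, 100

# Pooled proportion p-hat, used in the standard error under H0.
p_hat = (n1 * p1 + n2 * p2) / (n1 + n2)
se = sqrt(p_hat * (1 - p_hat) / n1 + p_hat * (1 - p_hat) / n2)
z = (p1 - p2) / se
print(round(z, 2))    # about 1.42
print(abs(z) > 1.96)  # False for these numbers: do not reject H0
```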
Small samples (n < 25)
• Two consequences:
– the t distribution is used instead of the standard
normal for tests of the mean
– tests of proportions cannot be done by the standard
methods used in the book.
t = (x̄ − μ) / √(s²/n) ~ t(n−1)
Testing a mean
• A sample of 12 cars of a particular make average
35 mpg, with standard deviation 15. Test the
manufacturer’s claim of 40 mpg as the true
average.
• H0: μ = 40
H1: μ < 40.
• The test statistic is
t = (35 − 40) / √(15²/12) = −1.15.
• The critical value of the t distribution (df = 11, 5%
significance level, one tail) is t* = 1.796.
• Hence we cannot reject the manufacturer's claim,
since −1.15 > −1.796.
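A minimal Python sketch of this small-sample test (the critical value 1.796 is taken from t tables, as in the lecture, rather than computed):

```python
from math import sqrt

# One-tailed t-test: H0: mu = 40 against H1: mu < 40.
xbar, mu0, s, n = 35, 40, 15, 12
t = (xbar - mu0) / sqrt(s**2 / n)
t_crit = 1.796  # t distribution, df = 11, 5% in one tail (from tables)
print(round(t, 2))  # -1.15
print(t < -t_crit)  # False: cannot reject H0
```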
Testing the difference of two means
• The test statistic is
t = [(x̄1 − x̄2) − (μ1 − μ2)] / √(S²/n1 + S²/n2)
• where S² is the pooled variance
S² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2).
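A minimal Python sketch of the pooled-variance calculation, again with hypothetical samples (not from the lecture) for illustration:

```python
from math import sqrt

# Hypothetical small samples.
xbar1, s1, n1 = 35, 15, 12
xbar2, s2, n2 = 40, 12, 10

# Pooled variance S^2 combines the two sample variances,
# weighted by their degrees of freedom.
S2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

# Under H0: mu1 - mu2 = 0.
t = ((xbar1 - xbar2) - 0) / sqrt(S2 / n1 + S2 / n2)
print(S2)           # about 188.6
print(round(t, 2))  # about -0.85
```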
Summary
• The principles are the same for all tests: calculate
the test statistic and see if it falls into the rejection
region.
• The formula for the test statistic depends upon
the problem (mean, proportion, etc).
• The rejection region varies, depending upon
whether it is a one or two-tailed test.