
Page 1: PROBABILITY AND STATISTICS FOR ENGINEERING

PROBABILITY AND STATISTICS FOR ENGINEERING

Hossein Sameti

Department of Computer Engineering, Sharif University of Technology

Principles of Parameter Estimation

Page 2: PROBABILITY AND STATISTICS FOR ENGINEERING

The Estimation Problem

We use the various concepts introduced and studied in earlier lectures to solve practical problems of interest.

Consider the problem of estimating an unknown parameter of interest from a few of its noisy observations, for example:

- the daily temperature in a city
- the depth of a river at a particular spot

Observations (measurements) are made on data that contain the desired nonrandom parameter and undesired noise.

Page 3: PROBABILITY AND STATISTICS FOR ENGINEERING

The Estimation Problem

For example,

$$\text{Observation} = (\text{desired signal part}) + (\text{noise part}),$$

or, the $i$-th observation can be represented as

$$X_i = \theta + n_i, \quad i = 1, 2, \ldots, n.$$

Here $\theta$ is the unknown nonrandom desired parameter, and $n_i,\ i = 1, 2, \ldots, n,$ are random variables that may be dependent or independent from observation to observation.

The Estimation Problem: given $n$ observations $X_1 = x_1,\ X_2 = x_2,\ \ldots,\ X_n = x_n$, obtain the "best" estimator $\hat{\theta}$ for the unknown parameter $\theta$ in terms of these observations.
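To make the model concrete, here is a minimal sketch (not part of the original slides) that simulates the observation model; the Gaussian noise and the particular parameter value are illustrative assumptions.

```python
# Minimal sketch of the observation model X_i = theta + n_i: a fixed,
# nonrandom parameter observed through additive random noise.
# Gaussian noise and the chosen values are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(42)
theta = 21.5                                    # unknown parameter, e.g. a temperature
noise = rng.normal(loc=0.0, scale=2.0, size=10)
x = theta + noise                               # n = 10 noisy observations
print(x)
```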

Page 4: PROBABILITY AND STATISTICS FOR ENGINEERING

Estimators

Let us denote by $\hat{\theta}(X)$ the estimator for $\theta$. Obviously $\hat{\theta}(X)$ is a function of the observations only. "Best estimator" in what sense?

Ideal solution: the estimate $\hat{\theta}(X)$ coincides with the unknown $\theta$. Almost always, any estimate will result in an error given by

$$e = \hat{\theta}(X) - \theta.$$

One strategy would be to select the estimator $\hat{\theta}(X)$ so as to minimize some function of this error:
- the mean square error (MMSE)
- the absolute value of the error
- etc.

Page 5: PROBABILITY AND STATISTICS FOR ENGINEERING

A More Fundamental Approach: Principle of Maximum Likelihood

Underlying assumption: the available data $X_1, X_2, \ldots, X_n$ have something to do with the unknown parameter $\theta$. We assume that the joint p.d.f. of $X_1, X_2, \ldots, X_n$, namely $f_X(x_1, x_2, \ldots, x_n; \theta)$, depends on $\theta$.

This method
- assumes that the given sample data set is representative of the population $f_X(x_1, x_2, \ldots, x_n; \theta)$, and
- chooses the value for $\theta$ that most likely caused the observed data to occur.

Page 6: PROBABILITY AND STATISTICS FOR ENGINEERING

Principle of Maximum Likelihood

In other words, given the observations $x_1, x_2, \ldots, x_n$, the joint p.d.f. $f_X(x_1, x_2, \ldots, x_n; \theta)$ is a function of $\theta$ alone.

The value of $\theta$ that maximizes the above p.d.f. is the most likely value for $\theta$, and it is chosen as the ML estimate $\hat{\theta}_{ML}(X)$ for $\theta$.

Page 7: PROBABILITY AND STATISTICS FOR ENGINEERING

Given $X_1 = x_1,\ X_2 = x_2,\ \ldots,\ X_n = x_n$, the joint p.d.f. $f_X(x_1, x_2, \ldots, x_n; \theta)$ represents the likelihood function.

The ML estimate can be determined either from the likelihood equation

$$\hat{\theta}_{ML} = \arg\sup_{\theta} f_X(x_1, x_2, \ldots, x_n; \theta),$$

or by using the log-likelihood function

$$L(x_1, x_2, \ldots, x_n; \theta) = \log f_X(x_1, x_2, \ldots, x_n; \theta).$$

If $L(x_1, x_2, \ldots, x_n; \theta)$ is differentiable and a supremum exists in the above equation, then that $\hat{\theta}_{ML}$ must satisfy the equation

$$\left.\frac{\partial \log f_X(x_1, x_2, \ldots, x_n; \theta)}{\partial \theta}\right|_{\theta = \hat{\theta}_{ML}} = 0.$$
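The likelihood equation can also be read numerically: fix the data, treat the log-likelihood as a function of $\theta$ alone, and hand it to a scalar optimizer. The sketch below (an illustration, not from the slides) does this for the Gaussian model of the next example; `scipy.optimize.minimize_scalar` on the negative log-likelihood is just one convenient choice.

```python
# Numerically solving the likelihood equation: maximize L(theta) for the
# fixed observed data by minimizing -L(theta) with a scalar optimizer.
import numpy as np
from scipy.optimize import minimize_scalar

def neg_log_likelihood(theta, x, sigma=1.0):
    # -log f_X(x_1,...,x_n; theta) for X_i = theta + w_i, w_i ~ N(0, sigma^2)
    n = len(x)
    return (n / 2) * np.log(2 * np.pi * sigma**2) \
        + np.sum((x - theta) ** 2) / (2 * sigma**2)

rng = np.random.default_rng(0)
x = 3.0 + rng.normal(0.0, 1.0, size=100)      # observations with true theta = 3
res = minimize_scalar(neg_log_likelihood, args=(x,),
                      bounds=(x.min(), x.max()), method="bounded")
print(res.x, x.mean())   # numeric maximizer agrees with the sample mean
```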

Page 8: PROBABILITY AND STATISTICS FOR ENGINEERING

Example

Let

$$X_i = \theta + w_i, \quad i = 1, \ldots, n,$$

represent $n$ observations, where $\theta$ is the unknown parameter of interest and $w_i,\ i = 1, 2, \ldots, n,$ are zero-mean independent normal r.v.s with common variance $\sigma^2$. Determine the ML estimate for $\theta$.

Solution

Since the $w_i$ are independent r.v.s and $\theta$ is an unknown constant, the $X_i$ are independent normal random variables. Thus the likelihood function takes the form

$$f_X(x_1, x_2, \ldots, x_n; \theta) = \prod_{i=1}^{n} f_{X_i}(x_i; \theta).$$

Page 9: PROBABILITY AND STATISTICS FOR ENGINEERING

Example - continued

Each $X_i$ is Gaussian with mean $\theta$ and variance $\sigma^2$ (why?). Thus

$$f_{X_i}(x_i; \theta) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x_i - \theta)^2 / 2\sigma^2}.$$

Therefore the likelihood function is

$$f_X(x_1, x_2, \ldots, x_n; \theta) = \frac{1}{(2\pi\sigma^2)^{n/2}}\, e^{-\sum_{i=1}^{n}(x_i - \theta)^2 / 2\sigma^2}.$$

It is easier to work with the log-likelihood function $L(X; \theta)$ in this case.

Page 10: PROBABILITY AND STATISTICS FOR ENGINEERING

Example - continued

We obtain

$$L(X; \theta) = \ln f_X(x_1, x_2, \ldots, x_n; \theta) = -\frac{n}{2}\ln(2\pi\sigma^2) - \sum_{i=1}^{n}\frac{(x_i - \theta)^2}{2\sigma^2},$$

and taking the derivative with respect to $\theta$, we get

$$\left.\frac{\partial \ln f_X(x_1, x_2, \ldots, x_n; \theta)}{\partial \theta}\right|_{\theta = \hat{\theta}_{ML}} = \sum_{i=1}^{n}\frac{2(x_i - \hat{\theta}_{ML})}{2\sigma^2} = 0,$$

or

$$\hat{\theta}_{ML}(X) = \frac{1}{n}\sum_{i=1}^{n} X_i.$$

This linear estimator represents the ML estimate for $\theta$.
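A quick numeric sanity check (with illustrative values, not from the slides): at $\hat{\theta}_{ML} = \bar{x}$ the score $\sum_i (x_i - \theta)/\sigma^2$ vanishes, confirming the stationarity condition just derived.

```python
# At theta_hat = mean(x) the derivative of the Gaussian log-likelihood,
# sum(x_i - theta)/sigma^2, is zero: the closed form solves the
# likelihood equation. Values are illustrative.
import numpy as np

rng = np.random.default_rng(7)
sigma, theta_true = 2.0, 5.0
x = theta_true + rng.normal(0.0, sigma, size=1000)

theta_hat = x.mean()                        # the ML estimate derived above
score = np.sum(x - theta_hat) / sigma**2    # dL/dtheta evaluated at theta_hat
print(theta_hat, score)                     # score is 0 up to round-off
```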

Page 11: PROBABILITY AND STATISTICS FOR ENGINEERING

Unbiased Estimators

Notice that the estimator $\hat{\theta}_{ML}$ is a r.v. Taking its expected value, we get

$$E[\hat{\theta}_{ML}(X)] = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\, n\theta = \theta,$$

i.e., the expected value of the estimator does not differ from the desired parameter, and hence there is no bias between the two. Such estimators are known as unbiased estimators.

Thus $\hat{\theta}_{ML}(X) = \frac{1}{n}\sum_{i=1}^{n} X_i$ represents an unbiased estimator for $\theta$.
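As a small illustration (with arbitrary simulation values), averaging the sample-mean estimator over many independent experiments recovers $\theta$, which is exactly the unbiasedness property above.

```python
# Monte-Carlo check of unbiasedness: the average of theta_hat over many
# independent experiments approaches theta. All values are illustrative.
import numpy as np

rng = np.random.default_rng(3)
theta, sigma, n, trials = 4.0, 1.0, 25, 20_000
estimates = (theta + rng.normal(0.0, sigma, size=(trials, n))).mean(axis=1)
print(estimates.mean())   # close to theta = 4.0, i.e. E[theta_hat] = theta
```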

Page 12: PROBABILITY AND STATISTICS FOR ENGINEERING

Consistent Estimators

Moreover, the variance of the estimator is given by

$$\mathrm{Var}(\hat{\theta}_{ML}) = E[(\hat{\theta}_{ML} - \theta)^2] = E\left[\Big(\frac{1}{n}\sum_{i=1}^{n} X_i - \theta\Big)^2\right] = \frac{1}{n^2}\sum_{i=1}^{n} E[(X_i - \theta)^2] + \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j \ne i} E[(X_i - \theta)(X_j - \theta)].$$

The latter terms are zero since $X_i$ and $X_j$ are independent r.v.s. So

$$\mathrm{Var}(\hat{\theta}_{ML}) = \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}(X_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n},$$

and

$$\mathrm{Var}(\hat{\theta}_{ML}) \to 0 \ \text{ as } \ n \to \infty,$$

another desired property. Estimators that satisfy this limit are said to be consistent estimators.
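A quick empirical check (illustrative values): the variance of the sample-mean estimator shrinks like $\sigma^2/n$, which is the consistency behavior just described.

```python
# Empirical check of Var(theta_hat) = sigma^2 / n for growing n.
import numpy as np

rng = np.random.default_rng(5)
theta, sigma, trials = 0.0, 1.0, 20_000
for n in (10, 100, 1000):
    est = (theta + rng.normal(0.0, sigma, size=(trials, n))).mean(axis=1)
    print(n, est.var(), sigma**2 / n)   # empirical vs. theoretical variance
```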

Page 13: PROBABILITY AND STATISTICS FOR ENGINEERING

Example

Let $X_1, X_2, \ldots, X_n$ be i.i.d. uniform random variables in $(0, \theta)$ with common p.d.f.

$$f_{X_i}(x_i; \theta) = \frac{1}{\theta}, \quad 0 < x_i \le \theta,$$

where $\theta$ is an unknown parameter. Find the ML estimate for $\theta$.

Solution

The likelihood function in this case is given by

$$f_X(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n; \theta) = \frac{1}{\theta^n}, \quad 0 < x_i \le \theta,\ i = 1, 2, \ldots, n,$$

that is,

$$f_X(x_1, x_2, \ldots, x_n; \theta) = \frac{1}{\theta^n}, \quad 0 < \max(x_1, x_2, \ldots, x_n) \le \theta.$$

The likelihood function here is maximized by the minimum value of $\theta$.

Page 14: PROBABILITY AND STATISTICS FOR ENGINEERING

Example - continued

Since $\theta \ge \max(X_1, X_2, \ldots, X_n)$, we get

$$\hat{\theta}_{ML}(X) = \max(X_1, X_2, \ldots, X_n)$$

to be the ML estimate for $\theta$: a nonlinear function of the observations.

Is this an unbiased estimate for $\theta$? To answer that we need to evaluate its mean, and it is easier to determine its p.d.f. and proceed directly. Let

$$Z = \max(X_1, X_2, \ldots, X_n), \qquad f_{X_i}(x_i) = \frac{1}{\theta}, \quad 0 < x_i \le \theta.$$

Page 15: PROBABILITY AND STATISTICS FOR ENGINEERING

Example - continued

Then

$$F_Z(z) = P[\max(X_1, X_2, \ldots, X_n) \le z] = P(X_1 \le z, X_2 \le z, \ldots, X_n \le z) = \prod_{i=1}^{n} P(X_i \le z) = [F_{X_i}(z)]^n = \left(\frac{z}{\theta}\right)^n, \quad 0 < z \le \theta,$$

so that

$$f_Z(z) = \begin{cases} \dfrac{n z^{n-1}}{\theta^n}, & 0 < z \le \theta, \\ 0, & \text{otherwise.} \end{cases}$$

Using the above, we get

$$E[\hat{\theta}_{ML}(X)] = E(Z) = \int_0^{\theta} z f_Z(z)\, dz = \int_0^{\theta} \frac{n z^n}{\theta^n}\, dz = \frac{n}{n+1}\,\theta = \frac{\theta}{1 + 1/n}.$$

Page 16: PROBABILITY AND STATISTICS FOR ENGINEERING

Example - continued

In this case $E[\hat{\theta}_{ML}(X)] \ne \theta$, so the ML estimator is not an unbiased estimator for $\theta$. However, note that as $n \to \infty$,

$$\lim_{n \to \infty} E[\hat{\theta}_{ML}(X)] = \lim_{n \to \infty} \frac{\theta}{1 + 1/n} = \theta,$$

i.e., the ML estimator is an asymptotically unbiased estimator. Also,

$$E(Z^2) = \int_0^{\theta} z^2 f_Z(z)\, dz = \int_0^{\theta} \frac{n z^{n+1}}{\theta^n}\, dz = \frac{n\theta^2}{n+2},$$

so that

$$\mathrm{Var}[\hat{\theta}_{ML}(X)] = E(Z^2) - [E(Z)]^2 = \frac{n\theta^2}{n+2} - \frac{n^2\theta^2}{(n+1)^2} = \frac{n\theta^2}{(n+1)^2(n+2)},$$

and $\mathrm{Var}[\hat{\theta}_{ML}(X)] \to 0$ as $n \to \infty$, implying that this estimator is a consistent estimator.
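The bias factor $n/(n+1)$ and the variance formula can both be seen in simulation (illustrative values, not from the slides):

```python
# The max estimator for the uniform (0, theta) example: biased by the
# factor n/(n+1) for finite n, asymptotically unbiased, and consistent.
import numpy as np

rng = np.random.default_rng(11)
theta, trials = 2.0, 20_000
for n in (5, 50, 500):
    z = rng.uniform(0.0, theta, size=(trials, n)).max(axis=1)
    print(n,
          z.mean(), theta * n / (n + 1),                    # vs. E[Z]
          z.var(), n * theta**2 / ((n + 1)**2 * (n + 2)))   # vs. Var[Z]
```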

Page 17: PROBABILITY AND STATISTICS FOR ENGINEERING

Example

Let $X_1, X_2, \ldots, X_n$ be i.i.d. Gamma random variables with unknown parameters $\alpha$ and $\beta$. Determine the ML estimators for $\alpha$ and $\beta$.

Solution

Here $x_i > 0$, and

$$f_X(x_1, x_2, \ldots, x_n; \alpha, \beta) = \frac{\beta^{n\alpha}}{[\Gamma(\alpha)]^n}\,\prod_{i=1}^{n} x_i^{\alpha - 1}\, e^{-\beta \sum_{i=1}^{n} x_i}.$$

This gives the log-likelihood function to be

$$L(x_1, x_2, \ldots, x_n; \alpha, \beta) = \log f_X(x_1, x_2, \ldots, x_n; \alpha, \beta) = n\alpha\log\beta - n\log\Gamma(\alpha) + (\alpha - 1)\sum_{i=1}^{n}\log x_i - \beta\sum_{i=1}^{n} x_i.$$

Page 18: PROBABILITY AND STATISTICS FOR ENGINEERING

Example - continued

Differentiating $L$ with respect to $\alpha$ and $\beta$, we get

$$\left.\frac{\partial L}{\partial \alpha}\right|_{\hat{\alpha}_{ML},\, \hat{\beta}_{ML}} = n\log\hat{\beta}_{ML} - n\frac{\Gamma'(\hat{\alpha}_{ML})}{\Gamma(\hat{\alpha}_{ML})} + \sum_{i=1}^{n}\log x_i = 0,$$

$$\left.\frac{\partial L}{\partial \beta}\right|_{\hat{\alpha}_{ML},\, \hat{\beta}_{ML}} = \frac{n\hat{\alpha}_{ML}}{\hat{\beta}_{ML}} - \sum_{i=1}^{n} x_i = 0.$$

Thus,

$$\hat{\beta}_{ML} = \frac{\hat{\alpha}_{ML}}{\frac{1}{n}\sum_{i=1}^{n} x_i}.$$

So,

$$\log\hat{\alpha}_{ML} - \frac{\Gamma'(\hat{\alpha}_{ML})}{\Gamma(\hat{\alpha}_{ML})} = \log\Big(\frac{1}{n}\sum_{i=1}^{n} x_i\Big) - \frac{1}{n}\sum_{i=1}^{n}\log x_i.$$

Notice that this is highly nonlinear in $\hat{\alpha}_{ML}$.
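Nonlinear likelihood equations like this one are usually solved numerically. One possible sketch (assuming the rate parameterization above; the data values are illustrative) uses `scipy.special.digamma` for $\Gamma'(\alpha)/\Gamma(\alpha)$ and a bracketing root finder:

```python
# Numeric solution of log(a) - digamma(a) = log(mean(x)) - mean(log(x)),
# the nonlinear ML equation for the Gamma shape, then beta from
# dL/d(beta) = 0. Assumes the rate parameterization used above.
import numpy as np
from scipy.special import digamma
from scipy.optimize import brentq

rng = np.random.default_rng(1)
alpha_true, beta_true = 2.5, 1.5
x = rng.gamma(shape=alpha_true, scale=1.0 / beta_true, size=5000)

# Data-dependent right-hand side; strictly positive by Jensen's inequality.
c = np.log(x.mean()) - np.log(x).mean()

# log(a) - digamma(a) decreases from +inf toward 0, so a root can be bracketed.
alpha_ml = brentq(lambda a: np.log(a) - digamma(a) - c, 1e-6, 1e6)
beta_ml = alpha_ml / x.mean()
print(alpha_ml, beta_ml)    # close to (2.5, 1.5) for this sample size
```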

Page 19: PROBABILITY AND STATISTICS FOR ENGINEERING

Conclusion

In general, the (log-)likelihood function:
- can have more than one solution, or no solution at all
- may not even be differentiable
- can be extremely complicated to solve explicitly

Page 20: PROBABILITY AND STATISTICS FOR ENGINEERING

Best Unbiased Estimator

We have seen that

$$\hat{\theta}_{ML}(X) = \frac{1}{n}\sum_{i=1}^{n} X_i$$

represents an unbiased estimator for $\theta$ with variance $\sigma^2/n$.

It is possible that, for a given $n$, there are other unbiased estimators for this problem with even lower variance. If that is indeed the case, those estimators are naturally preferable to the previous one.

Is it possible to determine the lowest possible value for the variance of any unbiased estimator? A theorem by Cramér and Rao gives a complete answer to this problem.

Page 21: PROBABILITY AND STATISTICS FOR ENGINEERING

Cramér-Rao Bound

The variance of any unbiased estimator $\hat{\theta}$ based on observations $X_1 = x_1,\ X_2 = x_2,\ \ldots,\ X_n = x_n$ for $\theta$ must satisfy the lower bound

$$\mathrm{Var}(\hat{\theta}) \ge \frac{1}{E\left[\Big(\frac{\partial \ln f_X(x_1, x_2, \ldots, x_n; \theta)}{\partial \theta}\Big)^2\right]} = \frac{1}{-E\left[\frac{\partial^2 \ln f_X(x_1, x_2, \ldots, x_n; \theta)}{\partial \theta^2}\right]}.$$

The right side of the above equation acts as a lower bound on the variance of all unbiased estimators for $\theta$, provided their joint p.d.f. satisfies certain regularity restrictions (see (8-79)-(8-81), Text).

Page 22: PROBABILITY AND STATISTICS FOR ENGINEERING

Efficient Estimators

Any unbiased estimator whose variance coincides with the Cramér-Rao bound must be the best. Such estimators are known as efficient estimators. Let us examine whether $\hat{\theta}_{ML}(X) = \frac{1}{n}\sum_{i=1}^{n} X_i$ is efficient. Here

$$\frac{\partial \ln f_X(x_1, \ldots, x_n; \theta)}{\partial \theta} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(X_i - \theta),$$

and

$$E\left[\Big(\frac{\partial \ln f_X(x_1, \ldots, x_n; \theta)}{\partial \theta}\Big)^2\right] = \frac{1}{\sigma^4}\left(\sum_{i=1}^{n} E[(X_i - \theta)^2] + \sum_{i=1}^{n}\sum_{j \ne i} E[(X_i - \theta)(X_j - \theta)]\right) = \frac{1}{\sigma^4}\, n\sigma^2 = \frac{n}{\sigma^2}.$$

So the Cramér-Rao lower bound is $\sigma^2/n$.
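A numeric reading of this conclusion (illustrative values): the empirical variance of the sample mean matches the bound $\sigma^2/n$, which is what efficiency means here.

```python
# The sample mean attains the Cramér-Rao bound sigma^2 / n for the
# Gaussian-mean model: its empirical variance matches the bound.
import numpy as np

rng = np.random.default_rng(13)
theta, sigma, n, trials = 1.0, 3.0, 50, 20_000
x = theta + rng.normal(0.0, sigma, size=(trials, n))
print(x.mean(axis=1).var())   # empirical variance of the ML estimator
print(sigma**2 / n)           # Cramér-Rao lower bound: the two agree
```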

Page 23: PROBABILITY AND STATISTICS FOR ENGINEERING

Rao-Blackwell Theorem

As we obtained before, the variance of this ML estimator ($\sigma^2/n$) is the same as the specified bound, so it is an efficient estimator.

If there are no unbiased estimators that are efficient, the best estimator will be an unbiased estimator with the lowest possible variance. How does one find such an unbiased estimator? The Rao-Blackwell theorem gives a complete answer to this problem.

The Cramér-Rao bound can be extended to the multiparameter case as well.

Page 24: PROBABILITY AND STATISTICS FOR ENGINEERING

Estimating Parameters with a-priori p.d.f

So far, we have discussed nonrandom parameters that are unknown. What if the parameter of interest $\theta$ is a r.v. with a-priori p.d.f. $f_\theta(\theta)$? How does one obtain a good estimate for $\theta$ based on the observations $X_1 = x_1,\ X_2 = x_2,\ \ldots,\ X_n = x_n$?

One technique is to use the observations to compute the a-posteriori p.d.f. $f_{\theta|X}(\theta \mid x_1, x_2, \ldots, x_n)$. Of course, we can use Bayes' theorem to obtain this a-posteriori p.d.f.:

$$f_{\theta|X}(\theta \mid x_1, x_2, \ldots, x_n) = \frac{f_{X|\theta}(x_1, x_2, \ldots, x_n \mid \theta)\, f_\theta(\theta)}{f_X(x_1, x_2, \ldots, x_n)}.$$

Notice that this is a function of $\theta$ only, since $x_1, x_2, \ldots, x_n$ represent given observations.

Page 25: PROBABILITY AND STATISTICS FOR ENGINEERING

MAP Estimator

Once again, we can look for the most probable value of $\theta$ suggested by the above a-posteriori p.d.f. $f_{\theta|X}(\theta \mid x_1, x_2, \ldots, x_n)$.

Naturally, the most likely value for $\theta$ is the one corresponding to the maximum of the a-posteriori p.d.f.; this is the MAP estimator $\hat{\theta}_{MAP}$ for $\theta$.

It is possible to use other optimality criteria as well.
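As a closing sketch (an assumed setup for illustration, not from the slides), take a Gaussian prior $\theta \sim N(\mu_0, \sigma_0^2)$ and Gaussian observations $X_i = \theta + w_i$ with known $\sigma$. The a-posteriori p.d.f. is then proportional to $f(x \mid \theta)\, f_\theta(\theta)$, and its maximizer can be found by a direct search and checked against the known conjugate closed form.

```python
# MAP sketch for an assumed Gaussian prior and Gaussian likelihood:
# maximize log f(x | theta) + log f(theta) over a grid of theta values,
# then compare against the conjugate Gaussian-Gaussian closed form.
import numpy as np

rng = np.random.default_rng(17)
mu0, s0, sigma, n = 0.0, 1.0, 2.0, 20
theta_true = 1.2
x = theta_true + rng.normal(0.0, sigma, size=n)

def log_posterior(theta):
    # log-likelihood plus log-prior, up to an additive constant
    return -np.sum((x - theta) ** 2) / (2 * sigma**2) \
           - (theta - mu0) ** 2 / (2 * s0**2)

grid = np.linspace(-5.0, 5.0, 100_001)
theta_map = grid[np.argmax([log_posterior(t) for t in grid])]

# Closed form for this conjugate case: precision-weighted combination.
closed = (mu0 / s0**2 + x.sum() / sigma**2) / (1 / s0**2 + n / sigma**2)
print(theta_map, closed)    # agree up to the grid resolution
```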