Estimators

An estimator is a statistical function of the observable sample data that is used to estimate an unknown parameter (called the estimand). The result of applying the function to a particular sample of data is called an estimate.

It is possible to construct many estimators for a given parameter. The performance of an estimator may be evaluated by using loss functions.

To estimate a parameter (e.g. a population mean), the usual procedure is as follows:

Select a random sample from the population of interest.
Calculate the point estimate of the parameter.
Calculate a measure of variability, often a confidence interval.
Associate this measure of variability with the estimate.
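
As an illustration, here is a minimal sketch of this procedure in Python, assuming a small hypothetical sample from a normally distributed population (so a t-based 95% confidence interval is appropriate); the numbers are made up purely for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical random sample (e.g. heights in cm) from the population of interest.
sample = np.array([172.1, 168.4, 175.0, 169.8, 171.3, 166.9, 173.5, 170.2])

# Point estimate of the population mean.
point_estimate = sample.mean()

# Measure of variability: a 95% t-based confidence interval for the mean.
n = sample.size
se = sample.std(ddof=1) / np.sqrt(n)                      # standard error of the mean
ci = stats.t.interval(0.95, n - 1, loc=point_estimate, scale=se)

print(f"point estimate = {point_estimate:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```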

Point estimation makes use of sample data to calculate a single value which serves as a best guess for an unknown parameter.

Point estimation should be contrasted with general Bayesian methods of estimation, where the goal is usually to compute the posterior distributions of parameters and other quantities of interest. The contrast here is between estimating a single point and estimating a weighted set of points (a PDF).

Statistical estimation uses aspects of the scientific method, which involves collecting evidence that is meant to be consistent or inconsistent with a given hypothesis. As evidence accumulates, the degree of belief in a hypothesis ought to change. Hypotheses with very high support should be accepted as true, and those with low support should be rejected.

Maximum Likelihood (ML)
Method of Moments
Minimum Mean Squared Error (MMSE)
Minimum Variance Unbiased Estimator (MVUE)
Best Linear Unbiased Estimator (BLUE)

Now we discuss each of them in detail.

Maximum likelihood (ML) is a popular statistical method used for fitting a statistical model to data and providing estimates for the model's parameters.

For example, suppose you are interested in the heights of Americans. You have a sample of some number of Americans, but not the entire population, and record their heights. Further, you are willing to assume that heights are normally distributed with some unknown mean and variance. The sample mean is then the maximum likelihood estimator of the population mean, and the sample variance is a close approximation to the maximum likelihood estimator of the population variance.
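
A minimal sketch of the ML estimates for this normal model, using a hypothetical set of heights (not real survey data): the ML estimate of the mean is the sample mean, while the ML estimate of the variance divides by n rather than n − 1, which is why the usual sample variance is only a close approximation to it.

```python
import numpy as np

# Hypothetical sample of heights (cm); the real data would come from the survey.
heights = np.array([175.2, 168.9, 181.4, 172.0, 165.5, 178.3, 170.1, 174.6])

n = heights.size
mu_ml = heights.mean()                        # ML estimate of the population mean
var_ml = np.sum((heights - mu_ml) ** 2) / n   # ML estimate of the variance (divides by n, not n - 1)
var_sample = heights.var(ddof=1)              # usual sample variance, a close approximation

print(f"ML mean = {mu_ml:.2f}, ML variance = {var_ml:.2f}, sample variance = {var_sample:.2f}")
```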

For a fixed set of data and underlying probability model, maximum likelihood picks the values of the model parameters that make the data "more likely" than any other values would.

Maximum likelihood estimation gives a unique, easy-to-determine solution in the case of the normal distribution and many other problems, although in very complex problems this may not be the case.

If a uniform prior distribution is assumed over the parameters, the maximum likelihood estimate coincides with the most probable values of the parameters.
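
The "pick the parameter values that make the data most likely" idea can be made concrete with a small numerical sketch. Here the model is assumed to be exponential with rate λ and the data are hypothetical; a grid search over λ maximizes the log-likelihood and is compared with the closed-form MLE 1/mean.

```python
import numpy as np

# Hypothetical waiting times, assumed to follow an exponential distribution.
data = np.array([0.8, 1.5, 0.3, 2.2, 0.9, 1.1, 0.4, 1.7])

def log_likelihood(lam, x):
    # Exponential log-likelihood: n*log(lam) - lam*sum(x)
    return x.size * np.log(lam) - lam * x.sum()

# "Pick the parameter value that makes the data most likely": grid search over lambda.
grid = np.linspace(0.01, 5.0, 10_000)
ll = log_likelihood(grid, data)     # vectorized over the whole grid
lam_hat = grid[np.argmax(ll)]

print(f"numerical MLE ≈ {lam_hat:.3f}, closed form 1/mean = {1 / data.mean():.3f}")
```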

The method of moments estimates population parameters such as the mean, variance, and median by equating sample moments with the corresponding (unobservable) population moments and then solving those equations for the quantities to be estimated.

Estimates obtained by the method of moments may be used as a first approximation to the solutions of the likelihood equations, and successively improved approximations may then be found by the Newton-Raphson method.
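
A small sketch of the method of moments, assuming (purely for illustration) that the data follow a gamma distribution with shape k and scale θ, so that E[X] = kθ and Var[X] = kθ²; equating these to the sample moments and solving gives the estimates.

```python
import numpy as np

# Hypothetical positive-valued data, assumed Gamma(shape k, scale theta) for illustration.
x = np.array([2.1, 3.4, 1.8, 5.2, 2.9, 4.1, 3.3, 2.6])

# Equate the first two sample moments with the population moments:
#   E[X] = k * theta,   Var[X] = k * theta**2
m1 = x.mean()
m2 = x.var()            # second central sample moment

theta_mom = m2 / m1     # scale estimate
k_mom = m1 / theta_mom  # shape estimate, i.e. m1**2 / m2

print(f"method-of-moments estimates: shape k = {k_mom:.2f}, scale theta = {theta_mom:.2f}")
```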

In some cases, infrequent with large samples but not so infrequent with small samples, the estimates given by the method of moments fall outside the parameter space; it does not make sense to rely on them then.

Also, estimates by the method of moments are not necessarily sufficient statistics, i.e., they sometimes fail to take into account all relevant information in the sample.

The MSE of an estimator is one of many ways to quantify the difference between an estimator and the true value of the quantity being estimated.

The MSE is the second moment (about the origin) of the error, and thus incorporates both the variance of the estimator and its bias. For an unbiased estimator, the MSE is the variance.

In analogy to the standard deviation, taking the square root of the MSE yields the root mean squared error (RMSE). For an unbiased estimator, the RMSE is called the standard error.
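
In symbols (standard definitions that the slides describe only in words), for an estimator θ̂ of a parameter θ:

$$
\mathrm{MSE}(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right] = \operatorname{Var}(\hat{\theta}) + \operatorname{Bias}(\hat{\theta})^2,
\qquad
\mathrm{RMSE}(\hat{\theta}) = \sqrt{\mathrm{MSE}(\hat{\theta})}
$$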

Since the MSE is an expectation, it is a scalar and not a random variable. It may be a function of an unknown parameter, but it does not depend on any random quantities.

An MSE of zero, meaning that the estimator predicts observations of the parameter with perfect accuracy, is the ideal and forms the basis for the least squares method of regression analysis.

While particular values of the MSE other than zero are meaningless in and of themselves, they may be used for comparative purposes. The unbiased model with the smallest MSE is generally interpreted as best explaining the variability in the observations.

Minimizing the MSE is a key criterion in selecting estimators. Among unbiased estimators, minimizing the MSE is equivalent to minimizing the variance, and the minimum is attained by the MVUE (minimum variance unbiased estimator).

Like the variance, the mean squared error has the disadvantage of heavily weighting outliers. This is a result of the squaring of each term, which effectively weights large errors more heavily than small ones. This property, undesirable in many applications, has led researchers to use alternatives such as the mean absolute error, or measures based on the median.

In statistics, the mean absolute error is a quantity used to measure how close forecasts or predictions are to the eventual outcomes. The mean absolute error (MAE) is given by

$$
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |e_i| = \frac{1}{n} \sum_{i=1}^{n} |f_i - y_i|
$$

As the name suggests, the mean absolute error is an average of the absolute errors e_i = f_i − y_i, where f_i is the prediction and y_i the true value.

The MAE and the RMSE can be used together to diagnose the variation in the errors in a set of forecasts. The RMSE will always be larger than or equal to the MAE; the greater the difference between them, the greater the variance in the individual errors in the sample. If RMSE = MAE, then all the errors are of the same magnitude.
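
A small sketch comparing the two measures on a hypothetical set of forecasts and outcomes (the numbers are made up):

```python
import numpy as np

# Hypothetical forecasts f and observed outcomes y.
f = np.array([10.2, 9.8, 11.5, 10.0, 9.4])
y = np.array([10.0, 10.1, 11.0, 10.3, 9.0])

errors = f - y
mae = np.mean(np.abs(errors))
rmse = np.sqrt(np.mean(errors ** 2))

# RMSE >= MAE always; a large gap signals high variance among the individual errors.
print(f"MAE = {mae:.3f}, RMSE = {rmse:.3f}")
```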

In statistics and signal processing, an MMSE estimator describes the approach which minimizes the mean square error, a common measure of estimator quality.

Let X be an unknown random variable and Y be a known random variable (the measurement). An estimator X̂(y) is any function of the measurement Y, and its MSE is given by

$$
\mathrm{MSE} = E\{(\hat{X} - X)^2\}
$$

The MMSE estimator is defined as the estimator achieving the minimal MSE.
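
A minimal sketch for the scalar Gaussian case, where the MMSE estimator E[X | Y] is linear in the measurement; the prior and noise variances below are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed model: X ~ N(0, var_x), measurement Y = X + N with N ~ N(0, var_n).
var_x, var_n = 4.0, 1.0
x = rng.normal(0.0, np.sqrt(var_x), size=100_000)
y = x + rng.normal(0.0, np.sqrt(var_n), size=x.size)

# For jointly Gaussian X and Y, the MMSE estimator is E[X|Y] = (var_x / (var_x + var_n)) * Y.
x_hat = (var_x / (var_x + var_n)) * y

mse = np.mean((x_hat - x) ** 2)
print(f"empirical MSE = {mse:.3f}, theoretical minimum = {var_x * var_n / (var_x + var_n):.3f}")
```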

The MVUE has lower variance than any other unbiased estimator for all possible values of the parameter.

An efficient estimator need not exist, but if it does, it is the MVUE, because the MSE is the sum of the variance and the squared bias of an estimator. The MVUE minimizes the MSE among unbiased estimators. In some cases biased estimators have lower MSE because they have a smaller variance than any unbiased estimator does.

It frequently occurs that the MVU estimator, even if it exists, cannot be found. For example, if the PDF is not known, the theory of sufficient statistics cannot be applied; and even if the PDF is known, a minimum variance estimator is not guaranteed to be found.

In such cases, we have to resort to a suboptimal estimator approach. We can restrict the estimator to a linear form that is unbiased and require it to have minimum variance within that class. An example of this approach is the Best Linear Unbiased Estimator (BLUE).

The BLUE applies to a linear model in which the errors have expectation zero, are uncorrelated, and have equal variances.
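
Under these Gauss–Markov assumptions, ordinary least squares gives the BLUE of the coefficients. A minimal sketch with simulated data (the design matrix and the true coefficients below are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Linear model y = X @ beta + e, with zero-mean, uncorrelated, equal-variance errors.
n = 200
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])  # intercept + one regressor
beta_true = np.array([2.0, 0.5])                          # assumed for illustration
y = X @ beta_true + rng.normal(0, 1.0, n)

# Ordinary least squares: the BLUE under the Gauss-Markov assumptions.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print("OLS estimate of beta:", beta_hat)
```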

Further estimation techniques include:

Maximum A Posteriori (MAP)
Wiener filter
Kalman filter
Particle filter
Markov chain Monte Carlo (MCMC)

Sometimes we have prior information about the PDF of the parameter to be estimated. Let the parameter θ be treated as a random variable; the associated probabilities are called prior probabilities. Bayes' theorem shows the way to incorporate this prior information in the estimation process:

$$
p(\theta \mid x) = \frac{p(x \mid \theta)\, p(\theta)}{p(x)}
$$

The term on the left-hand side is called the posterior; the numerator is the product of the likelihood term and the prior term; the denominator serves as a normalization term so that the posterior PDF integrates to unity.

In Bayesian statistics, the MAP estimate is the mode of the posterior distribution. Bayesian inference produces a maximum a posteriori (MAP) estimate.
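
A minimal sketch of a MAP estimate for a Bernoulli success probability with a Beta prior; the prior parameters and the data are assumptions for illustration, and the MAP estimate is the closed-form mode of the Beta posterior.

```python
# Hypothetical data: 7 successes out of 10 Bernoulli trials.
successes, trials = 7, 10

# Beta(a, b) prior on the success probability (assumed for illustration).
a, b = 2.0, 2.0

# Posterior is Beta(a + successes, b + failures); the MAP estimate is its mode.
a_post = a + successes
b_post = b + (trials - successes)
map_estimate = (a_post - 1) / (a_post + b_post - 2)

# For comparison, the ML estimate (equivalently, MAP with a uniform prior) is the sample proportion.
print(f"MAP = {map_estimate:.3f}, ML = {successes / trials:.3f}")
```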

The Wiener filter reduces the amount of noise present in a signal by comparison with an estimate of the desired noiseless signal. Since the filter assumes its inputs are stationary, it is not an adaptive filter.

Wiener filters are characterized by the following:

Assumption: the signal and the (additive) noise are stationary linear stochastic processes with known spectral characteristics or known autocorrelation and cross-correlation.
Requirement: the filter must be physically realizable, i.e. causal.
Performance criterion: minimum mean-square error (MMSE).

The input to the filter is assumed to be a signal s(t) corrupted by additive noise n(t). The output ŝ(t) is calculated by means of a filter g(t) using the following convolution:

$$
\hat{s}(t) = g(t) * \big(s(t) + n(t)\big)
$$

where g(t) is the Wiener filter's impulse response. The error is defined as

$$
e(t) = s(t + \alpha) - \hat{s}(t)
$$

where α is the delay of the Wiener filter (since it is causal). In other words, the error is the difference between the estimated signal and the true signal shifted by α.

The squared error is then given by

$$
e^2(t) = s^2(t + \alpha) - 2\, s(t + \alpha)\, \hat{s}(t) + \hat{s}^2(t)
$$

where s(t + α) is the desired output of the filter and e(t) is the error. Depending on the value of α, the problem goes by a different name:

If α > 0, the problem is that of prediction (the error is reduced when ŝ(t) is similar to a later value of s).
If α = 0, the problem is that of filtering (the error is reduced when ŝ(t) is similar to s(t)).
If α < 0, the problem is that of smoothing.

The Wiener filter problem has solutions for three possible cases:

One where a non-causal filter is acceptable (requiring an infinite amount of both past and future data).
The case where a causal filter is desired (using an infinite amount of past data).
The FIR case, where a finite amount of past data is used.

The first case is simple to solve but is not suited for real-time applications. Wiener's main accomplishment was solving the case where the causality requirement is in effect.
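
A minimal sketch of the FIR case: the filter taps are obtained by solving the Wiener–Hopf normal equations R g = p, where R is the autocorrelation matrix of the noisy observation and p is its cross-correlation with the desired signal. The AR(1) signal model and noise level below are assumptions chosen purely for illustration.

```python
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(2)

# Assumed signal model for illustration: a smooth AR(1) signal plus white noise.
n, taps = 5000, 8
s = np.zeros(n)
for t in range(1, n):
    s[t] = 0.95 * s[t - 1] + rng.normal(0, 0.3)
x = s + rng.normal(0, 0.5, n)          # noisy observation

def xcorr(a, b, lags):
    # Empirical correlation E[a(t + k) b(t)] for k = 0..lags-1.
    return np.array([np.mean(a[k:] * b[:n - k]) for k in range(lags)])

r = xcorr(x, x, taps)                  # autocorrelation of the observation
p = xcorr(s, x, taps)                  # cross-correlation of desired signal and observation

# Solve the Wiener-Hopf equations R g = p for the FIR filter taps.
g = np.linalg.solve(toeplitz(r), p)

# Apply the filter: s_hat(t) = sum_k g[k] * x(t - k).
s_hat = np.convolve(x, g)[:n]
print(f"MSE before = {np.mean((x - s) ** 2):.3f}, after = {np.mean((s_hat - s) ** 2):.3f}")
```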

A major limitation to more widespread implementation of Bayesian approaches is that obtaining the posterior distribution often requires the integration of high-dimensional functions. This can be computationally very difficult at times.

MCMC approaches are so named because one uses the previous sample values to randomly generate the next sample value, thus generating a Markov chain.
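
A minimal Metropolis–Hastings sketch: the target density (a standard normal here, purely for illustration) and the Gaussian random-walk proposal are assumptions; each new sample is generated from the previous one, forming a Markov chain.

```python
import numpy as np

rng = np.random.default_rng(3)

def log_target(theta):
    # Unnormalized log-density of the distribution we want to sample (standard normal here).
    return -0.5 * theta ** 2

samples = np.empty(20_000)
theta = 0.0                                      # arbitrary starting point
for i in range(samples.size):
    proposal = theta + rng.normal(0, 1.0)        # random-walk proposal from the previous value
    if np.log(rng.uniform()) < log_target(proposal) - log_target(theta):
        theta = proposal                         # accept the proposal
    samples[i] = theta                           # otherwise keep the previous value

# After discarding burn-in, the samples approximate the target (here: mean ~ 0, sd ~ 1).
print(f"mean = {samples[2000:].mean():.3f}, sd = {samples[2000:].std():.3f}")
```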
