Estimators

An estimator is a statistical function of the observable sample data that is used to estimate an unknown parameter (called the estimand). The result of applying the function to a particular sample of data is called an estimate.

It is possible to construct many estimators for a given parameter. The performance of an estimator may be evaluated by using loss functions.

To estimate a parameter (e.g. a population mean), the usual procedure is as follows:

Select a random sample from the population of interest.
Calculate the point estimate of the parameter.
Calculate a measure of variability, often a confidence interval.
Associate this measure of variability with the estimate.
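
As an illustration, here is a minimal sketch of this procedure in Python, assuming a small hypothetical sample from a normally distributed population (so a t-based 95% confidence interval is appropriate); the numbers are made up purely for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical random sample (e.g. heights in cm) from the population of interest.
sample = np.array([172.1, 168.4, 175.0, 169.8, 171.3, 166.9, 173.5, 170.2])

# Point estimate of the population mean.
point_estimate = sample.mean()

# Measure of variability: a 95% t-based confidence interval for the mean.
n = sample.size
se = sample.std(ddof=1) / np.sqrt(n)                      # standard error of the mean
ci = stats.t.interval(0.95, n - 1, loc=point_estimate, scale=se)

print(f"point estimate = {point_estimate:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```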

Point estimation makes use of sample data to calculate a single value which serves as a best guess for an unknown parameter.

Point estimation should be contrasted with general Bayesian methods of estimation, where the goal is usually to compute the posterior distributions of parameters and other quantities of interest. The contrast here is between estimating a single point and estimating a weighted set of points (a PDF).

Statistical estimation uses aspects of the scientific method, which involves collecting evidence that is meant to be consistent or inconsistent with a given hypothesis. As evidence accumulates, the degree of belief in a hypothesis ought to change. Hypotheses with very high support should be accepted as true, and those with low support should be rejected.

Maximum Likelihood (ML)
Method of Moments
Minimum Mean Squared Error (MMSE)
Minimum Variance Unbiased Estimator (MVUE)
Best Linear Unbiased Estimator (BLUE)

Now we discuss each of them in detail.

Maximum likelihood (ML) is a popular statistical method used for fitting a statistical model to data and providing estimates for the model's parameters.

For example, suppose you are interested in the heights of Americans. You have a sample of some number of Americans, but not the entire population, and record their heights. Further, you are willing to assume that heights are normally distributed with some unknown mean and variance. The sample mean is then the maximum likelihood estimator of the population mean, and the sample variance is a close approximation to the maximum likelihood estimator of the population variance.
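
A minimal sketch of the ML estimates for this normal model, using a hypothetical set of heights (not real survey data): the ML estimate of the mean is the sample mean, while the ML estimate of the variance divides by n rather than n − 1, which is why the usual sample variance is only a close approximation to it.

```python
import numpy as np

# Hypothetical sample of heights (cm); the real data would come from the survey.
heights = np.array([175.2, 168.9, 181.4, 172.0, 165.5, 178.3, 170.1, 174.6])

n = heights.size
mu_ml = heights.mean()                        # ML estimate of the population mean
var_ml = np.sum((heights - mu_ml) ** 2) / n   # ML estimate of the variance (divides by n, not n - 1)
var_sample = heights.var(ddof=1)              # usual sample variance, a close approximation

print(f"ML mean = {mu_ml:.2f}, ML variance = {var_ml:.2f}, sample variance = {var_sample:.2f}")
```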

For a fixed set of data and underlying probability model, maximum likelihood picks the values of the model parameters that make the data "more likely" than any other values would.

Maximum likelihood estimation gives a unique, easy-to-determine solution in the case of the normal distribution and many other problems, although in very complex problems this may not be the case.

If a uniform prior distribution is assumed over the parameters, the maximum likelihood estimate coincides with the most probable values of the parameters.
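
The "pick the parameter values that make the data most likely" idea can be made concrete with a small numerical sketch. Here the model is assumed to be exponential with rate λ and the data are hypothetical; a grid search over λ maximizes the log-likelihood and is compared with the closed-form MLE 1/mean.

```python
import numpy as np

# Hypothetical waiting times, assumed to follow an exponential distribution.
data = np.array([0.8, 1.5, 0.3, 2.2, 0.9, 1.1, 0.4, 1.7])

def log_likelihood(lam, x):
    # Exponential log-likelihood: n*log(lam) - lam*sum(x)
    return x.size * np.log(lam) - lam * x.sum()

# "Pick the parameter value that makes the data most likely": grid search over lambda.
grid = np.linspace(0.01, 5.0, 10_000)
ll = log_likelihood(grid, data)     # vectorized over the whole grid
lam_hat = grid[np.argmax(ll)]

print(f"numerical MLE ≈ {lam_hat:.3f}, closed form 1/mean = {1 / data.mean():.3f}")
```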

The method of moments estimates population parameters such as the mean, variance, and median by equating sample moments with the corresponding (unobservable) population moments and then solving those equations for the quantities to be estimated.

Estimates obtained by the method of moments may be used as a first approximation to the solutions of the likelihood equations, and successively improved approximations may then be found by the Newton-Raphson method.
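
A small sketch of the method of moments, assuming (purely for illustration) that the data follow a gamma distribution with shape k and scale θ, so that E[X] = kθ and Var[X] = kθ²; equating these to the sample moments and solving gives the estimates.

```python
import numpy as np

# Hypothetical positive-valued data, assumed Gamma(shape k, scale theta) for illustration.
x = np.array([2.1, 3.4, 1.8, 5.2, 2.9, 4.1, 3.3, 2.6])

# Equate the first two sample moments with the population moments:
#   E[X] = k * theta,   Var[X] = k * theta**2
m1 = x.mean()
m2 = x.var()            # second central sample moment

theta_mom = m2 / m1     # scale estimate
k_mom = m1 / theta_mom  # shape estimate, i.e. m1**2 / m2

print(f"method-of-moments estimates: shape k = {k_mom:.2f}, scale theta = {theta_mom:.2f}")
```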

In some cases, infrequent with large samples but not so infrequent with small samples, the estimates given by the method of moments fall outside the parameter space; it does not make sense to rely on them then.

Also, estimates by the method of moments are not necessarily sufficient statistics, i.e., they sometimes fail to take into account all relevant information in the sample.

The MSE of an estimator is one of many ways to quantify the difference between an estimator and the true value of the quantity being estimated.

The MSE is the second moment (about the origin) of the error, and thus incorporates both the variance of the estimator and its bias. For an unbiased estimator, the MSE is the variance.

In analogy to the standard deviation, taking the square root of the MSE yields the root mean squared error (RMSE). For an unbiased estimator, the RMSE is called the standard error.
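
In symbols (standard definitions that the slides describe only in words), for an estimator θ̂ of a parameter θ:

$$
\mathrm{MSE}(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right] = \operatorname{Var}(\hat{\theta}) + \operatorname{Bias}(\hat{\theta})^2,
\qquad
\mathrm{RMSE}(\hat{\theta}) = \sqrt{\mathrm{MSE}(\hat{\theta})}
$$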

Since the MSE is an expectation, it is a scalar and not a random variable. It may be a function of an unknown parameter, but it does not depend on any random quantities.

An MSE of zero, meaning that the estimator predicts observations of the parameter with perfect accuracy, is the ideal and forms the basis for the least squares method of regression analysis.

While particular values of the MSE other than zero are meaningless in and of themselves, they may be used for comparative purposes. The unbiased model with the smallest MSE is generally interpreted as best explaining the variability in the observations.

Minimizing the MSE is a key criterion in selecting estimators. Among unbiased estimators, minimizing the MSE is equivalent to minimizing the variance, and the minimum is attained by the MVUE (minimum variance unbiased estimator).

Like the variance, the mean squared error has the disadvantage of heavily weighting outliers. This is a result of the squaring of each term, which effectively weights large errors more heavily than small ones. This property, undesirable in many applications, has led researchers to use alternatives such as the mean absolute error, or measures based on the median.

In statistics, the mean absolute error is a quantity used to measure how close forecasts or predictions are to the eventual outcomes. The mean absolute error (MAE) is given by

$$
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |e_i| = \frac{1}{n} \sum_{i=1}^{n} |f_i - y_i|
$$

As the name suggests, the mean absolute error is an average of the absolute errors e_i = f_i − y_i, where f_i is the prediction and y_i the true value.

The MAE and the RMSE can be used together to diagnose the variation in the errors in a set of forecasts. The RMSE will always be larger than or equal to the MAE; the greater the difference between them, the greater the variance in the individual errors in the sample. If RMSE = MAE, then all the errors are of the same magnitude.
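
A small sketch comparing the two measures on a hypothetical set of forecasts and outcomes (the numbers are made up):

```python
import numpy as np

# Hypothetical forecasts f and observed outcomes y.
f = np.array([10.2, 9.8, 11.5, 10.0, 9.4])
y = np.array([10.0, 10.1, 11.0, 10.3, 9.0])

errors = f - y
mae = np.mean(np.abs(errors))
rmse = np.sqrt(np.mean(errors ** 2))

# RMSE >= MAE always; a large gap signals high variance among the individual errors.
print(f"MAE = {mae:.3f}, RMSE = {rmse:.3f}")
```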

In statistics and signal processing, an MMSE estimator describes the approach which minimizes the mean square error, a common measure of estimator quality.

Let X be an unknown random variable and Y be a known random variable (the measurement). An estimator X̂(y) is any function of the measurement Y, and its MSE is given by

$$
\mathrm{MSE} = E\{(\hat{X} - X)^2\}
$$

The MMSE estimator is defined as the estimator achieving the minimal MSE.
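
A minimal sketch for the scalar Gaussian case, where the MMSE estimator E[X | Y] is linear in the measurement; the prior and noise variances below are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed model: X ~ N(0, var_x), measurement Y = X + N with N ~ N(0, var_n).
var_x, var_n = 4.0, 1.0
x = rng.normal(0.0, np.sqrt(var_x), size=100_000)
y = x + rng.normal(0.0, np.sqrt(var_n), size=x.size)

# For jointly Gaussian X and Y, the MMSE estimator is E[X|Y] = (var_x / (var_x + var_n)) * Y.
x_hat = (var_x / (var_x + var_n)) * y

mse = np.mean((x_hat - x) ** 2)
print(f"empirical MSE = {mse:.3f}, theoretical minimum = {var_x * var_n / (var_x + var_n):.3f}")
```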

The MVUE has lower variance than any other unbiased estimator for all possible values of the parameter.

An efficient estimator need not exist, but if it does, it is the MVUE, because the MSE is the sum of the variance and the squared bias of an estimator. The MVUE minimizes the MSE among unbiased estimators. In some cases biased estimators have lower MSE because they have a smaller variance than any unbiased estimator does.

It frequently occurs that the MVU estimator, even if it exists, cannot be found. For example, if the PDF is not known, the theory of sufficient statistics cannot be applied; and even if the PDF is known, a minimum variance estimator is not guaranteed to be found.

In such cases, we have to resort to a suboptimal estimator approach. We can restrict the estimator to a linear form that is unbiased and require it to have minimum variance within that class. An example of this approach is the Best Linear Unbiased Estimator (BLUE).

The BLUE applies to a linear model in which the errors have expectation zero, are uncorrelated, and have equal variances.
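
Under these Gauss–Markov assumptions, ordinary least squares gives the BLUE of the coefficients. A minimal sketch with simulated data (the design matrix and the true coefficients below are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Linear model y = X @ beta + e, with zero-mean, uncorrelated, equal-variance errors.
n = 200
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])  # intercept + one regressor
beta_true = np.array([2.0, 0.5])                          # assumed for illustration
y = X @ beta_true + rng.normal(0, 1.0, n)

# Ordinary least squares: the BLUE under the Gauss-Markov assumptions.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print("OLS estimate of beta:", beta_hat)
```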

Further estimation techniques include:

Maximum A Posteriori (MAP)
Wiener filter
Kalman filter
Particle filter
Markov chain Monte Carlo (MCMC)

Sometimes we have prior information about the PDF of the parameter to be estimated. Let the parameter θ be treated as a random variable; the associated probabilities are called prior probabilities. Bayes' theorem shows the way to incorporate this prior information in the estimation process:

$$
p(\theta \mid x) = \frac{p(x \mid \theta)\, p(\theta)}{p(x)}
$$

The term on the left-hand side is called the posterior; the numerator is the product of the likelihood term and the prior term; the denominator serves as a normalization term so that the posterior PDF integrates to unity.

In Bayesian statistics, the MAP estimate is the mode of the posterior distribution. Bayesian inference produces a maximum a posteriori (MAP) estimate.
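
A minimal sketch of a MAP estimate for a Bernoulli success probability with a Beta prior; the prior parameters and the data are assumptions for illustration, and the MAP estimate is the closed-form mode of the Beta posterior.

```python
# Hypothetical data: 7 successes out of 10 Bernoulli trials.
successes, trials = 7, 10

# Beta(a, b) prior on the success probability (assumed for illustration).
a, b = 2.0, 2.0

# Posterior is Beta(a + successes, b + failures); the MAP estimate is its mode.
a_post = a + successes
b_post = b + (trials - successes)
map_estimate = (a_post - 1) / (a_post + b_post - 2)

# For comparison, the ML estimate (equivalently, MAP with a uniform prior) is the sample proportion.
print(f"MAP = {map_estimate:.3f}, ML = {successes / trials:.3f}")
```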

The Wiener filter reduces the amount of noise present in a signal by comparison with an estimate of the desired noiseless signal. Since the filter assumes its inputs are stationary, it is not an adaptive filter.

Wiener filters are characterized by the following:

Assumption: the signal and the (additive) noise are stationary linear stochastic processes with known spectral characteristics or known autocorrelation and cross-correlation.
Requirement: the filter must be physically realizable, i.e. causal.
Performance criterion: minimum mean-square error (MMSE).

The input to the filter is assumed to be a signal s(t) corrupted by additive noise n(t). The output ŝ(t) is calculated by means of a filter g(t) using the following convolution:

$$
\hat{s}(t) = g(t) * \big(s(t) + n(t)\big)
$$

where g(t) is the Wiener filter's impulse response. The error is defined as

$$
e(t) = s(t + \alpha) - \hat{s}(t)
$$

where α is the delay of the Wiener filter (since it is causal). In other words, the error is the difference between the estimated signal and the true signal shifted by α.

The squared error is then given by

$$
e^2(t) = s^2(t + \alpha) - 2\, s(t + \alpha)\, \hat{s}(t) + \hat{s}^2(t)
$$

where s(t + α) is the desired output of the filter and e(t) is the error. Depending on the value of α, the problem goes by a different name:

If α > 0, the problem is that of prediction (the error is reduced when ŝ(t) is similar to a later value of s).
If α = 0, the problem is that of filtering (the error is reduced when ŝ(t) is similar to s(t)).
If α < 0, the problem is that of smoothing.

The Wiener filter problem has solutions for three possible cases:

One where a non-causal filter is acceptable (requiring an infinite amount of both past and future data).
The case where a causal filter is desired (using an infinite amount of past data).
The FIR case, where a finite amount of past data is used.

The first case is simple to solve but is not suited for real-time applications. Wiener's main accomplishment was solving the case where the causality requirement is in effect.
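
A minimal sketch of the FIR case: the filter taps are obtained by solving the Wiener–Hopf normal equations R g = p, where R is the autocorrelation matrix of the noisy observation and p is its cross-correlation with the desired signal. The AR(1) signal model and noise level below are assumptions chosen purely for illustration.

```python
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(2)

# Assumed signal model for illustration: a smooth AR(1) signal plus white noise.
n, taps = 5000, 8
s = np.zeros(n)
for t in range(1, n):
    s[t] = 0.95 * s[t - 1] + rng.normal(0, 0.3)
x = s + rng.normal(0, 0.5, n)          # noisy observation

def xcorr(a, b, lags):
    # Empirical correlation E[a(t + k) b(t)] for k = 0..lags-1.
    return np.array([np.mean(a[k:] * b[:n - k]) for k in range(lags)])

r = xcorr(x, x, taps)                  # autocorrelation of the observation
p = xcorr(s, x, taps)                  # cross-correlation of desired signal and observation

# Solve the Wiener-Hopf equations R g = p for the FIR filter taps.
g = np.linalg.solve(toeplitz(r), p)

# Apply the filter: s_hat(t) = sum_k g[k] * x(t - k).
s_hat = np.convolve(x, g)[:n]
print(f"MSE before = {np.mean((x - s) ** 2):.3f}, after = {np.mean((s_hat - s) ** 2):.3f}")
```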

A major limitation to more widespread implementation of Bayesian approaches is that obtaining the posterior distribution often requires the integration of high-dimensional functions. This can be computationally very difficult at times.

MCMC approaches are so named because one uses the previous sample values to randomly generate the next sample value, thus generating a Markov chain.
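
A minimal Metropolis–Hastings sketch: the target density (a standard normal here, purely for illustration) and the Gaussian random-walk proposal are assumptions; each new sample is generated from the previous one, forming a Markov chain.

```python
import numpy as np

rng = np.random.default_rng(3)

def log_target(theta):
    # Unnormalized log-density of the distribution we want to sample (standard normal here).
    return -0.5 * theta ** 2

samples = np.empty(20_000)
theta = 0.0                                      # arbitrary starting point
for i in range(samples.size):
    proposal = theta + rng.normal(0, 1.0)        # random-walk proposal from the previous value
    if np.log(rng.uniform()) < log_target(proposal) - log_target(theta):
        theta = proposal                         # accept the proposal
    samples[i] = theta                           # otherwise keep the previous value

# After discarding burn-in, the samples approximate the target (here: mean ~ 0, sd ~ 1).
print(f"mean = {samples[2000:].mean():.3f}, sd = {samples[2000:].std():.3f}")
```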
