8/2/2019 Bayesian Methods of Estimation
MAKERERE UNIVERSITY
COLLEGE OF ENGINEERING, DESIGN, ART AND TECHNOLOGY
DEPARTMENT OF CIVIL ENGINEERING

Math assignment

Group members

NAME                      REGISTRATION NUMBER   STUDENT NUMBER
OLARA ALLAN               10/U/683              210001123
ARIKOD RICHARD            10/U/657              210001135
MUKIIZA JULIUS            10/U/671              210001151
NDARAMA MICHEAL SIMON     10/X/3007/PSA         210004611
TWEHEYO DISHAN            10/U/690              210001016
BULUMA MELINDA            10/U/662              210000809
NAMIYA MARIAM             10/U/676              210000345
SSONKO EMMANUEL           10/U/9979/PSA         210005498
BUYINZA ABBEY             10/U/663              210000879
ARIKE PATRICK             10/U/9989/PSA         210018734
TUKAMUSHABA EMMY          08/U/3053/PSA         208006302
ANGURA GABRIEL            10/U/9946/PSA         210018907
SEMAHORO ALLAN            10/U/9945/PSA         210006460
NANKABIRWA ROSE           10/U/678              210000348
MUTONGOLE SAMUEL          10/U/9965/PSA         210009531
NAMPEERA ROBINAH          10/U/677              210001032
KIGONYA ALLAN             10/U/668              210000683
OCAN GEOFREY              10/U/9971/PSA         210017525
MUTYOGOMA MAUYA           10/U/9998/PS          210006589
DRATELE SIGFRIED BUDRA    10/U/1914             210001946
MUKIIBI SSEMAKULA PETER   10/U/9964/PSA         210006993
KINENE SERWANGA BRIAN     10/U/687              210001541
OLUKA PATRICK             10/U/1002/PS          210006598
Group 5 math assignment
BAYESIAN ESTIMATION OF DISTRIBUTION PARAMETERS
Introduction

Bayes' theorem is a theorem with two distinct interpretations. In the Bayesian interpretation, it expresses how a subjective degree of belief should rationally change to account for evidence. In the frequentist interpretation, it relates inverse representations of the probabilities concerning two events. Bayesian statistics has applications in fields including science, engineering, medicine and law.
Basics of the Bayesian estimation method

Consider the problem of finding a point estimate of the parameter θ for the population f(x; θ). The classical approach would be to take a random sample of size n and substitute the information provided by the sample into the appropriate estimator or decision function. For the case of a binomial population b(x; n, p), the estimate of p, the proportion of successes, would be p̂ = x/n.

Suppose that additional information is given about the parameter θ, namely, that θ is known to vary according to some probability distribution f(θ), often called the prior distribution, with prior mean μ0 and prior variance σ0^2. That is, we are now assuming θ to be a value of a random variable with probability distribution f(θ), and we wish to estimate the particular value of θ for the population from which we selected our sample.

The probabilities associated with this prior distribution are called subjective probabilities, in that they measure a person's degree of belief in the location of the parameter.

Bayesian techniques use the prior distribution f(θ) along with the joint distribution of the sample to compute the posterior distribution f(θ | x1, x2, ..., xn). The posterior distribution combines information from the subjective prior distribution and the objective sample distribution, and expresses the degree of belief in the location of the parameter θ after you have observed the sample.
If we denote by f(x1, x2, ..., xn | θ) the joint probability distribution of the sample, conditional on the parameter θ, in a situation where θ is a random variable, the joint distribution of the sample and the parameter is then

f(x1, x2, ..., xn, θ) = f(x1, x2, ..., xn | θ) f(θ)
from which we readily obtain the marginal distribution

g(x1, x2, ..., xn) = ∫ f(x1, x2, ..., xn | θ) f(θ) dθ

(the integral being replaced by a sum when θ is discrete). Hence the posterior distribution may be written as

f(θ | x1, x2, ..., xn) = f(x1, x2, ..., xn, θ) / g(x1, x2, ..., xn)

Note: the mean of the posterior distribution, denoted by θ*, is called the Bayes estimate of θ, and the density f(θ | x1, x2, ..., xn) is called the posterior density.

Consider the Bayes estimation of the probability p of an event, where p is a realization of a random variable X with probability density function fX(x) whose range is 0 ≤ x ≤ 1. A prior estimate of p can be obtained from

p̂ = E(X) = ∫[0,1] x fX(x) dx .................... (1)

To improve on the estimate of p, we conduct an experiment of tossing a die n times and observing the number of aces to be k. Applying Bayes' theorem, the posterior density is written as

fX(x | k aces in n tosses) = P(k aces in n tosses | X = x) fX(x) / B .................... (2)

where B = ∫[0,1] P(k aces in n tosses | X = x) fX(x) dx. From the binomial probability law we obtain

P(k aces in n tosses | X = x) = C(n, k) x^k (1 - x)^(n - k) .................... (3)

Substituting equation (3) into (2), the binomial coefficient C(n, k) cancels between numerator and denominator, giving

fX(x | k aces in n tosses) = x^k (1 - x)^(n - k) fX(x) / ∫[0,1] u^k (1 - u)^(n - k) fX(u) du .................... (4)
The updated estimate of p is obtained by substituting from equation (4) for fX(x | k aces in n tosses) in (1):

p̂ = ∫[0,1] x fX(x | k aces in n tosses) dx .................... (5)

Assuming that X is uniformly distributed in 0 ≤ x ≤ 1, instead of having a general distribution in that range, equation (4) can be simplified. The integral

∫[0,1] x^k (1 - x)^(n - k) dx = k! (n - k)! / (n + 1)! .................... (6)

can be shown to be true using mathematical induction. Substituting fX(x) = 1 and using equation (6), we can evaluate B as

B = C(n, k) · k! (n - k)! / (n + 1)! = n! / (n + 1)! = 1 / (n + 1) .................... (7)

We can then express the conditional density as

fX(x | k aces in n tosses) = ((n + 1)! / (k! (n - k)!)) x^k (1 - x)^(n - k) .................... (8)

The posterior estimate for p is obtained from equation (5) as

p̂ = ((n + 1)! / (k! (n - k)!)) ∫[0,1] x^(k + 1) (1 - x)^(n - k) dx

and, applying equation (6) with k + 1 in place of k and n + 1 in place of n, this is given by

p̂ = ((n + 1)! / (k! (n - k)!)) · ((k + 1)! (n - k)! / (n + 2)!) = (k + 1) / (n + 2)

Theorem
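The closed-form answer p̂ = (k + 1)/(n + 2) can be checked against a direct numerical integration of equations (4) and (5). A minimal sketch assuming the uniform prior of this example; the function name, the grid size and the sample values n = 12, k = 2 are illustrative only:

```python
def posterior_mean_numeric(n, k, steps=20_000):
    """Posterior mean of p under a uniform prior, by midpoint-rule integration:
    num = integral of x * x^k (1-x)^(n-k), den = integral of x^k (1-x)^(n-k)."""
    h = 1.0 / steps
    num = den = 0.0
    for i in range(steps):
        x = (i + 0.5) * h                    # midpoint of the i-th subinterval
        w = x**k * (1.0 - x)**(n - k)        # unnormalised posterior density
        num += x * w * h
        den += w * h
    return num / den

n, k = 12, 2                                 # e.g. 12 tosses, 2 aces observed
analytic = (k + 1) / (n + 2)                 # closed form (Laplace's rule of succession)
numeric = posterior_mean_numeric(n, k)
print(analytic, numeric)                     # both approximately 0.2143
```

The agreement confirms that the normalising constant and the shift from (k, n) to (k + 1, n + 1) in equation (6) were applied correctly.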
Bayesian methods of estimation concerning the mean of a normal population are based on the following theorem.

If x̄ is the mean of a random sample of size n from a normal population with known variance σ^2, and the prior distribution of the population mean μ is a normal distribution with prior mean μ0 and prior variance σ0^2, then the posterior distribution of the population mean is also a normal distribution, with mean μ* and standard deviation σ*, where

μ* = (n x̄ σ0^2 + μ0 σ^2) / (n σ0^2 + σ^2)   and   σ* = sqrt( σ0^2 σ^2 / (n σ0^2 + σ^2) )

The posterior mean μ* is the Bayes estimate of the population mean μ, and a 100(1 - α)% Bayesian interval for μ can be constructed by computing the interval
μ* - z(α/2) σ* < μ < μ* + z(α/2) σ*,

which is centered at the posterior mean and contains 100(1 - α)% of the posterior probability.
Example
An electrical firm manufactures light bulbs whose length of life is approximately normally distributed with a standard deviation of 100 hours. Prior experience leads us to believe that μ is a value of a normal random variable with mean μ0 = 800 hours and standard deviation σ0 = 10 hours. If a random sample of 25 bulbs has an average life of 780 hours, find a 95% Bayesian interval for μ.
Solution
The posterior distribution of the mean μ is also normal, with mean

μ* = ((25)(780)(10^2) + (800)(100^2)) / ((25)(10^2) + 100^2) = 9,950,000 / 12,500 = 796

and standard deviation

σ* = sqrt( (10^2)(100^2) / ((25)(10^2) + 100^2) ) = sqrt(80) = 8.944

The 95% Bayesian interval for μ is then given by

796 - (1.96)(8.944) < μ < 796 + (1.96)(8.944)

that is, 778.5 < μ < 813.5.

By ignoring the prior information about μ in the example above, one could instead construct the classical 95% confidence interval

780 - (1.96)(100/sqrt(25)) < μ < 780 + (1.96)(100/sqrt(25)), that is, 740.8 < μ < 819.2,

which is considerably wider than the Bayesian interval.
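The example's numbers can be reproduced with a short script. A minimal sketch; the function name is ours, and the inputs are the values of this example with z = 1.96:

```python
import math

def bayes_normal_posterior(xbar, n, sigma, mu0, sigma0):
    """Posterior mean and standard deviation for a normal mean with a
    normal prior, when the population standard deviation sigma is known."""
    denom = n * sigma0**2 + sigma**2
    mu_star = (n * xbar * sigma0**2 + mu0 * sigma**2) / denom
    sd_star = math.sqrt(sigma0**2 * sigma**2 / denom)
    return mu_star, sd_star

mu_star, sd_star = bayes_normal_posterior(xbar=780, n=25, sigma=100, mu0=800, sigma0=10)
lo, hi = mu_star - 1.96 * sd_star, mu_star + 1.96 * sd_star
print(mu_star, sd_star)          # 796.0  ~8.944
print(lo, hi)                    # ~778.5  ~813.5

# Classical 95% interval, ignoring the prior:
half = 1.96 * 100 / math.sqrt(25)
print(780 - half, 780 + half)    # 740.8  819.2
```

Note how strongly the tight prior (σ0 = 10) pulls the estimate toward μ0 = 800 and narrows the interval relative to the classical one.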
Disadvantages of Bayesian Estimation

i. You can get very different posterior distributions by changing which parameters have uninformative priors. In other words, there are some tricky mechanical issues.
ii. The frequentist framework is ideal for the Popperian view of science because it allows you to falsify hypotheses. Under Bayesian statistics there is no such thing as falsification, just relative degrees of belief.
iii. Frequentist statistics is "easy" and has accepted conventions of method and notation. The same cannot be said of Bayesian statistics, which requires a deeper understanding of probability and likelihood.
VECTOR RANDOM VARIABLES
A random matrix (or random vector) is a matrix (vector) whose elements are random variables. Its elements are jointly distributed. Two random matrices X1 and X2 are independent if the elements of X1 (as a collection of random variables) are independent of the elements of X2; the elements within X1 or X2 do not have to be independent. Similarly, a collection of random matrices X1, ..., Xk is independent if their respective collections of random elements are (mutually) independent. (Again, the elements within any of the random matrices need not be independent.) Likewise, an infinite collection of random matrices is independent if every finite sub-collection is independent.
Expectation (mean) of a random matrix

The expected value or mean of an m × n random matrix X is the m × n matrix E(X) whose elements are the expected values of the corresponding elements of X, assuming that they all exist. That is, if

X = [x_ij],   i = 1, ..., m,   j = 1, ..., n,

then

E(X) = [E(x_ij)]

Properties:
- E(X') = E(X)'   (expectation commutes with the transpose)
- If X is square, E(tr(X)) = tr(E(X))
- If a is a constant, E(aX) = a E(X)
- E(vec(X)) = vec(E(X))
- If A and B are constant matrices, E(AXB) = A E(X) B
- E(X1 + X2) = E(X1) + E(X2)
- If X1 and X2 are independent, E(X1 X2) = E(X1) E(X2)
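The linearity property E(AXB) = A E(X) B can be illustrated with sample averages, which obey the same algebra. A sketch with arbitrary, illustrative matrix sizes and values:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3, 4))   # 500 draws of a 3x4 random matrix
A = rng.normal(size=(2, 3))        # constant matrices A and B
B = rng.normal(size=(4, 5))

lhs = (A @ X @ B).mean(axis=0)     # average of AXB over the draws
rhs = A @ X.mean(axis=0) @ B       # A times the averaged X times B
print(np.allclose(lhs, rhs))       # averaging commutes with A(.)B
```

The same style of check works for the transpose, trace and vec properties, since each is a linear operation on the elements of X.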
Covariance

Covariance measures the relationship between two random variables. If 3 or more random variables are jointly distributed, one must consider the covariances for all possible pairs. For 3 jointly distributed random variables x, y and z these are the 3 covariances: σ_xy for x and y, σ_yz for y and z, and σ_xz for x and z. Thus, in dealing with m jointly distributed random variables, it is convenient to collect them into a single vector. A random vector is one whose components are jointly distributed random variables. Therefore, if x1, x2, ..., xm are m jointly distributed random variables, the vector

x = [x1, x2, ..., xm]'

is a random vector. If μ1, μ2, ..., μm are the mean values of x1, x2, ..., xm respectively, then

μ = E(x) = [μ1, μ2, ..., μm]'

and the variance-covariance matrix of x is

Σxx = E[(x - μ)(x - μ)']

Noting that E[(xi - μi)(xj - μj)] = σij, the covariance of xi and xj, and that σij = σji, we obtain the symmetric matrix

Σxx = [ σ1^2   σ12    ...   σ1m
        σ12    σ2^2   ...   σ2m
        ...    ...    ...   ...
        σ1m    σ2m    ...   σm^2 ]

Note: the variances of the individual random variables form the main diagonal of Σxx; Σxx is the variance-covariance matrix of x. If the random variables in x are uncorrelated, all covariance (off-diagonal) elements of Σxx are zero and the matrix is diagonal. The relationship between the weight matrix W and the corresponding variance-covariance matrix, with subscripts added to indicate reference to the random vector x, is restated as

Wxx = σ0^2 Σxx^(-1)

where σ0^2 is the reference variance.
Caution: if Wxx is non-diagonal, the simple weights calculated in

W1 = σ0^2 / σ1^2
W2 = σ0^2 / σ2^2
  ...                                        (4-16)
Wm = σ0^2 / σm^2

are not to be used as diagonal elements of Wxx; only when Wxx is diagonal are the weights calculated in (4-16) identical to its diagonal elements.
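The caution can be demonstrated numerically: with a non-diagonal Σxx, the simple weights of (4-16) differ from the diagonal of Wxx = σ0^2 Σxx^(-1), and they agree once the covariances are zeroed out. A sketch with made-up variances and covariance:

```python
import numpy as np

sigma0_sq = 1.0
cov = np.array([[4.0, 1.5],
                [1.5, 9.0]])                 # non-diagonal variance-covariance matrix

W = sigma0_sq * np.linalg.inv(cov)           # full weight matrix W_xx
simple = sigma0_sq / np.diag(cov)            # (4-16): W_i = sigma0^2 / sigma_i^2

print(np.diag(W))     # differs from `simple` because cov is non-diagonal
print(simple)

# With the covariance removed, the two coincide:
W_diag = sigma0_sq * np.linalg.inv(np.diag(np.diag(cov)))
print(np.allclose(np.diag(W_diag), simple))
```

Using the simple weights here would silently ignore the correlation between the two observations.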
Example 1
Two observations are represented by the random vector

x = [x1, x2]'

The variances of x1 and x2 are σ1^2 and σ2^2 respectively, the covariance of x1 and x2 is σ12, and the correlation coefficient is ρ12.

(a) For a selected reference variance σ0^2, derive the weight matrix of x in terms of the given parameters.
(b) Show that the weights calculated in (4-16) are identical to the diagonal elements of the weight matrix only when σ12 = 0.
Solution

(a) The weight matrix of x is

Wxx = σ0^2 Σxx^(-1) = ( σ0^2 / (σ1^2 σ2^2 - σ12^2) ) [  σ2^2  -σ12
                                                        -σ12   σ1^2 ]

Since σ12 = ρ12 σ1 σ2, we have σ1^2 σ2^2 - σ12^2 = σ1^2 σ2^2 (1 - ρ12^2); thus

Wxx = ( σ0^2 / (σ1^2 σ2^2 (1 - ρ12^2)) ) [  σ2^2  -σ12
                                            -σ12   σ1^2 ]

(b) From (4-16),

W1 = σ0^2 / σ1^2   and   W2 = σ0^2 / σ2^2
The diagonal elements of Wxx are

σ0^2 / (σ1^2 (1 - ρ12^2))   and   σ0^2 / (σ2^2 (1 - ρ12^2)),

which equal W1 and W2 only when ρ12 = 0. When ρ12 ≠ 0, the weights W1 and W2 cannot be used as diagonal elements of Wxx.

Each element of Σxx can be divided by σ0^2 to yield a scaled version of Σxx called Qxx (the cofactor matrix of x):

Qxx = (1 / σ0^2) Σxx,   so that   Σxx = σ0^2 Qxx

Qxx is also called the relative covariance matrix.
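The closed form derived in (a) can be checked against a direct matrix inverse. A sketch with illustrative values σ1 = 2, σ2 = 3, ρ12 = 0.5:

```python
import numpy as np

s1, s2, rho, sigma0_sq = 2.0, 3.0, 0.5, 1.0
s12 = rho * s1 * s2                              # covariance sigma_12

cov = np.array([[s1**2, s12],
                [s12, s2**2]])
W_direct = sigma0_sq * np.linalg.inv(cov)        # W_xx by direct inversion

# Closed form from part (a)
factor = sigma0_sq / (s1**2 * s2**2 * (1 - rho**2))
W_closed = factor * np.array([[s2**2, -s12],
                              [-s12, s1**2]])

print(np.allclose(W_direct, W_closed))
```

The two agree because σ1^2 σ2^2 (1 - ρ12^2) is exactly the determinant of Σxx.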
The variance-covariance matrix (or covariance matrix) of an m × 1 random vector x is the m × m matrix V(x) (also written Var(x) or Cov(x)) defined by

V(x) = E[(x - E(x))(x - E(x))']

when the expectations all exist. In particular, V(x) is symmetric, and it is diagonal if the elements of x are independent.

Properties:
- If a is a constant vector, V(a) = 0.
- If A is a constant matrix and b a constant vector, V(Ax + b) = A V(x) A'.
- V(x) is always non-negative definite.

The covariance between the random vector x1 and the random vector x2 is defined to be the matrix

Cov(x1, x2) = E[(x1 - E(x1))(x2 - E(x2))'],

when all expectations exist. If A and B are constant matrices and c and d are constant vectors,

Cov(A x1 + c, B x2 + d) = A Cov(x1, x2) B'
Conditional expectation

The conditional expectation of a random matrix X1 given X2 = A (A being a constant matrix), written E(X1 | X2 = A), is the expectation of X1 defined using the conditional distribution of its elements given X2 = A. The conditional expectation E(X1 | X2) is the expectation of X1 defined using the conditional distribution of its elements given X2.

The double expectation formula is

E(E(X1 | X2)) = E(X1)

The conditional variance-covariance matrix V(x1 | X2 = A) or V(x1 | X2) for a random vector x1 is defined by putting the appropriate conditional expectations into the definition of the variance-covariance matrix. The conditional variance formula applies:

V(x1) = E(V(x1 | X2)) + V(E(x1 | X2))

For random vectors x1 and x2, the conditional covariance Cov(x1, x2 | x3 = A) or Cov(x1, x2 | x3) is defined by putting the appropriate conditional expectations into the definition of the covariance. An additional covariance formula is

Cov(x1, x2) = E(Cov(x1, x2 | x3)) + Cov(E(x1 | x3), E(x2 | x3))
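The formula V(x1) = E(V(x1 | x2)) + V(E(x1 | x2)) can be checked exactly on a small discrete example. A sketch in which x2 picks one of two equally likely groups, each with hypothetical conditional means and variances:

```python
# Two equally likely groups for x2; within each, x1 has a known mean and variance.
p = [0.5, 0.5]            # P(x2 = j)
cond_mean = [0.0, 4.0]    # E(x1 | x2 = j)  (illustrative values)
cond_var = [1.0, 9.0]     # V(x1 | x2 = j)

# Right-hand side: E(V(x1|x2)) + V(E(x1|x2))
e_cond_var = sum(pj * v for pj, v in zip(p, cond_var))
mean_of_means = sum(pj * m for pj, m in zip(p, cond_mean))
var_of_means = sum(pj * (m - mean_of_means) ** 2 for pj, m in zip(p, cond_mean))
rhs = e_cond_var + var_of_means

# Left-hand side: V(x1) = E(x1^2) - E(x1)^2, using E(x1^2 | x2) = V + mean^2
e_x1 = mean_of_means
e_x1_sq = sum(pj * (v + m**2) for pj, m, v in zip(p, cond_mean, cond_var))
lhs = e_x1_sq - e_x1**2
print(lhs, rhs)   # 9.0 9.0
```

The "within-group" variance contributes 5.0 and the "between-group" variance contributes 4.0, summing to the total variance 9.0.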
REFERENCES

Ronald E. Walpole, Raymond H. Myers, Sharon L. Myers, Probability and Statistics for Engineers and Scientists, 6th edition, pages 275-280.

Edward M. Mikhail, Analysis and Adjustment of Survey Measurements, School of Engineering, Purdue University, West Lafayette, Indiana.

Venkatarama Krishnan, Probability and Random Processes, John Wiley and Sons, 2006, pages 384-405.

Amos Storkey, MLPR lectures: Distributions and models, http://www.inf.ed.ac.uk/teaching/courses/mlpr/lectures/distnsandmodels-print4up.pdf, 2009. School of Informatics, University of Edinburgh.