8/2/2019 Bayesian Methods of Estimation
MAKERERE UNIVERSITY
COLLEGE OF ENGINEERING, DESIGN, ART AND TECHNOLOGY
DEPARTMENT OF CIVIL ENGINEERING

Math assignment

Group members

NAME                      REGISTRATION NUMBER   STUDENT NUMBER
OLARA ALLAN               10/U/683              210001123
ARIKOD RICHARD            10/U/657              210001135
MUKIIZA JULIUS            10/U/671              210001151
NDARAMA MICHEAL SIMON     10/X/3007/PSA         210004611
TWEHEYO DISHAN            10/U/690              210001016
BULUMA MELINDA            10/U/662              210000809
NAMIYA MARIAM             10/U/676              210000345
SSONKO EMMANUEL           10/U/9979/PSA         210005498
BUYINZA ABBEY             10/U/663              210000879
ARIKE PATRICK             10/U/9989/PSA         210018734
TUKAMUSHABA EMMY          08/U/3053/PSA         208006302
ANGURA GABRIEL            10/U/9946/PSA         210018907
SEMAHORO ALLAN            10/U/9945/PSA         210006460
NANKABIRWA ROSE           10/U/678              210000348
MUTONGOLE SAMUEL          10/U/9965/PSA         210009531
NAMPEERA ROBINAH          10/U/677              210001032
KIGONYA ALLAN             10/U/668              210000683
OCAN GEOFREY              10/U/9971/PSA         210017525
MUTYOGOMA MAUYA           10/U/9998/PS          210006589
DRATELE SIGFRIED BUDRA    10/U/1914             210001946
MUKIIBI SSEMAKULA PETER   10/U/9964/PSA         210006993
KINENE SERWANGA BRIAN     10/U/687              210001541
OLUKA PATRICK             10/U/1002/PS          210006598
Group 5 math assignment
BAYESIAN ESTIMATION OF DISTRIBUTION PARAMETERS
Introduction

Bayes' theorem is a theorem with two distinct interpretations. In the Bayesian interpretation, it expresses how a subjective degree of belief should rationally change to account for evidence. In the frequentist interpretation, it relates inverse representations of the probabilities concerning two events. Bayesian statistics has applications in fields including science, engineering, medicine and law.
Basics of the Bayesian estimation method

Consider the problem of finding a point estimate of the parameter θ for the population f(x; θ). The classical approach would be to take a random sample of size n and substitute the information provided by the sample into the appropriate estimator or decision function. For the case of a binomial population b(x; n, p), the estimate of p, the proportion of successes, would be p̂ = x/n.

Suppose that additional information is given about the parameter θ, namely, that θ is known to vary according to some probability distribution f(θ), often called the prior distribution, with prior mean μ0 and prior variance σ0^2. That is, we are now assuming θ to be a value of a random variable with probability distribution f(θ), and we wish to estimate the particular value of θ for the population from which we selected our sample.

The probabilities associated with this prior distribution are called subjective probabilities, in that they measure a person's degree of belief in the location of the parameter.

Bayesian techniques use the prior distribution f(θ) along with the joint distribution of the sample to compute the posterior distribution f(θ | x1, x2, ..., xn). The posterior distribution combines information from the subjective prior distribution and the objective sample distribution, and expresses the degree of belief in the location of the parameter θ after you have observed the sample.
If we denote by f(x1, x2, ..., xn | θ) the joint probability distribution of the sample, conditional on the parameter θ, in a situation where θ is a random variable, the joint distribution of the sample and the parameter is then

f(x1, x2, ..., xn, θ) = f(x1, x2, ..., xn | θ) f(θ)
from which we readily obtain the marginal distribution

g(x1, x2, ..., xn) = ∫ f(x1, x2, ..., xn | θ) f(θ) dθ

(the integral being replaced by a sum when θ is discrete). Hence the posterior distribution may be written as

f(θ | x1, x2, ..., xn) = f(x1, x2, ..., xn, θ) / g(x1, x2, ..., xn)

Note: the mean of the posterior distribution, denoted by θ*, is called the Bayes estimate of θ, and the density f(θ | x1, x2, ..., xn) is called the posterior density.

Consider the Bayes estimation of the probability p of an event, where p is a realization of a random variable X with probability density function fX(x) whose range is 0 ≤ x ≤ 1. A prior estimate of p can be obtained from

p̂ = E(X) = ∫[0,1] x fX(x) dx .................... (1)

To improve on the estimate of p, we conduct an experiment of tossing a die n times and observing the number of aces to be k. Applying Bayes' theorem, the posterior density is written as

fX(x | k aces in n tosses) = P(k aces in n tosses | X = x) fX(x) / B .................... (2)

where B = ∫[0,1] P(k aces in n tosses | X = x) fX(x) dx. From the binomial probability law we obtain

P(k aces in n tosses | X = x) = C(n, k) x^k (1 - x)^(n - k) .................... (3)

Substituting equation (3) into (2), the binomial coefficient C(n, k) cancels between numerator and denominator, giving

fX(x | k aces in n tosses) = x^k (1 - x)^(n - k) fX(x) / ∫[0,1] u^k (1 - u)^(n - k) fX(u) du .................... (4)
The updated estimate of p is obtained by substituting from equation (4) for fX(x | k aces in n tosses) in (1):

p̂ = ∫[0,1] x fX(x | k aces in n tosses) dx .................... (5)

Assuming that X is uniformly distributed in 0 ≤ x ≤ 1, instead of having a general distribution in that range, equation (4) can be simplified. The integral

∫[0,1] x^k (1 - x)^(n - k) dx = k! (n - k)! / (n + 1)! .................... (6)

can be shown to be true using mathematical induction. Substituting fX(x) = 1 and using equation (6), we can evaluate B as

B = C(n, k) · k! (n - k)! / (n + 1)! = n! / (n + 1)! = 1 / (n + 1) .................... (7)

We can then express the conditional density as

fX(x | k aces in n tosses) = ((n + 1)! / (k! (n - k)!)) x^k (1 - x)^(n - k) .................... (8)

The posterior estimate for p is obtained from equation (5) as

p̂ = ((n + 1)! / (k! (n - k)!)) ∫[0,1] x^(k + 1) (1 - x)^(n - k) dx

and, applying equation (6) with k + 1 in place of k and n + 1 in place of n, this is given by

p̂ = ((n + 1)! / (k! (n - k)!)) · ((k + 1)! (n - k)! / (n + 2)!) = (k + 1) / (n + 2)

Theorem
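The closed-form answer p̂ = (k + 1)/(n + 2) can be checked against a direct numerical integration of equations (4) and (5). A minimal sketch assuming the uniform prior of this example; the function name, the grid size and the sample values n = 12, k = 2 are illustrative only:

```python
def posterior_mean_numeric(n, k, steps=20_000):
    """Posterior mean of p under a uniform prior, by midpoint-rule integration:
    num = integral of x * x^k (1-x)^(n-k), den = integral of x^k (1-x)^(n-k)."""
    h = 1.0 / steps
    num = den = 0.0
    for i in range(steps):
        x = (i + 0.5) * h                    # midpoint of the i-th subinterval
        w = x**k * (1.0 - x)**(n - k)        # unnormalised posterior density
        num += x * w * h
        den += w * h
    return num / den

n, k = 12, 2                                 # e.g. 12 tosses, 2 aces observed
analytic = (k + 1) / (n + 2)                 # closed form (Laplace's rule of succession)
numeric = posterior_mean_numeric(n, k)
print(analytic, numeric)                     # both approximately 0.2143
```

The agreement confirms that the normalising constant and the shift from (k, n) to (k + 1, n + 1) in equation (6) were applied correctly.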
Bayesian methods of estimation concerning the mean of a normal population are based on the following theorem.

If x̄ is the mean of a random sample of size n from a normal population with known variance σ^2, and the prior distribution of the population mean μ is a normal distribution with prior mean μ0 and prior variance σ0^2, then the posterior distribution of the population mean is also a normal distribution, with mean μ* and standard deviation σ*, where

μ* = (n x̄ σ0^2 + μ0 σ^2) / (n σ0^2 + σ^2)   and   σ* = sqrt( σ0^2 σ^2 / (n σ0^2 + σ^2) )

The posterior mean μ* is the Bayes estimate of the population mean μ, and a 100(1 - α)% Bayesian interval for μ can be constructed by computing the interval
μ* - z(α/2) σ* < μ < μ* + z(α/2) σ*,

which is centered at the posterior mean and contains 100(1 - α)% of the posterior probability.
Example
An electrical firm manufactures light bulbs whose length of life is approximately normally distributed with a standard deviation of 100 hours. Prior experience leads us to believe that μ is a value of a normal random variable with mean μ0 = 800 hours and standard deviation σ0 = 10 hours. If a random sample of 25 bulbs has an average life of 780 hours, find a 95% Bayesian interval for μ.
Solution
The posterior distribution of the mean μ is also normal, with mean

μ* = ((25)(780)(10^2) + (800)(100^2)) / ((25)(10^2) + 100^2) = 9,950,000 / 12,500 = 796

and standard deviation

σ* = sqrt( (10^2)(100^2) / ((25)(10^2) + 100^2) ) = sqrt(80) = 8.944

The 95% Bayesian interval for μ is then given by

796 - (1.96)(8.944) < μ < 796 + (1.96)(8.944)

that is, 778.5 < μ < 813.5.

By ignoring the prior information about μ in the example above, one could instead construct the classical 95% confidence interval

780 - (1.96)(100/sqrt(25)) < μ < 780 + (1.96)(100/sqrt(25)), that is, 740.8 < μ < 819.2,

which is considerably wider than the Bayesian interval.
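The example's numbers can be reproduced with a short script. A minimal sketch; the function name is ours, and the inputs are the values of this example with z = 1.96:

```python
import math

def bayes_normal_posterior(xbar, n, sigma, mu0, sigma0):
    """Posterior mean and standard deviation for a normal mean with a
    normal prior, when the population standard deviation sigma is known."""
    denom = n * sigma0**2 + sigma**2
    mu_star = (n * xbar * sigma0**2 + mu0 * sigma**2) / denom
    sd_star = math.sqrt(sigma0**2 * sigma**2 / denom)
    return mu_star, sd_star

mu_star, sd_star = bayes_normal_posterior(xbar=780, n=25, sigma=100, mu0=800, sigma0=10)
lo, hi = mu_star - 1.96 * sd_star, mu_star + 1.96 * sd_star
print(mu_star, sd_star)          # 796.0  ~8.944
print(lo, hi)                    # ~778.5  ~813.5

# Classical 95% interval, ignoring the prior:
half = 1.96 * 100 / math.sqrt(25)
print(780 - half, 780 + half)    # 740.8  819.2
```

Note how strongly the tight prior (σ0 = 10) pulls the estimate toward μ0 = 800 and narrows the interval relative to the classical one.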
Disadvantages of Bayesian Estimation

i. You can get very different posterior distributions by changing which parameters have uninformative priors. In other words, there are some tricky mechanical issues.
ii. The frequentist framework is ideal for the Popperian view of science because it allows you to falsify hypotheses. Under Bayesian statistics there is no such thing as falsification, just relative degrees of belief.
iii. Frequentist statistics is "easy" and has accepted conventions of method and notation. The same cannot be said of Bayesian statistics, which requires a deeper understanding of probability and likelihood.
VECTOR RANDOM VARIABLES
A random matrix (or random vector) is a matrix (vector) whose elements are random variables. Its elements are jointly distributed. Two random matrices X1 and X2 are independent if the elements of X1 (as a collection of random variables) are independent of the elements of X2; the elements within X1 or X2 do not have to be independent. Similarly, a collection of random matrices X1, ..., Xk is independent if their respective collections of random elements are (mutually) independent. (Again, the elements within any of the random matrices need not be independent.) Likewise, an infinite collection of random matrices is independent if every finite sub-collection is independent.
Expectation (mean) of a random matrix

The expected value or mean of an m × n random matrix X is the m × n matrix E(X) whose elements are the expected values of the corresponding elements of X, assuming that they all exist. That is, if

X = [x_ij],   i = 1, ..., m,   j = 1, ..., n,

then

E(X) = [E(x_ij)]

Properties:
- E(X') = E(X)'   (expectation commutes with the transpose)
- If X is square, E(tr(X)) = tr(E(X))
- If a is a constant, E(aX) = a E(X)
- E(vec(X)) = vec(E(X))
- If A and B are constant matrices, E(AXB) = A E(X) B
- E(X1 + X2) = E(X1) + E(X2)
- If X1 and X2 are independent, E(X1 X2) = E(X1) E(X2)
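The linearity property E(AXB) = A E(X) B can be illustrated with sample averages, which obey the same algebra. A sketch with arbitrary, illustrative matrix sizes and values:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3, 4))   # 500 draws of a 3x4 random matrix
A = rng.normal(size=(2, 3))        # constant matrices A and B
B = rng.normal(size=(4, 5))

lhs = (A @ X @ B).mean(axis=0)     # average of AXB over the draws
rhs = A @ X.mean(axis=0) @ B       # A times the averaged X times B
print(np.allclose(lhs, rhs))       # averaging commutes with A(.)B
```

The same style of check works for the transpose, trace and vec properties, since each is a linear operation on the elements of X.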
Covariance

Covariance measures the relationship between two random variables. If 3 or more random variables are jointly distributed, one must consider the covariances for all possible pairs. For 3 jointly distributed random variables x, y and z these are the 3 covariances: σ_xy for x and y, σ_yz for y and z, and σ_xz for x and z. Thus, in dealing with m jointly distributed random variables, it is convenient to collect them into a single vector. A random vector is one whose components are jointly distributed random variables. Therefore, if x1, x2, ..., xm are m jointly distributed random variables, the vector

x = [x1, x2, ..., xm]'

is a random vector. If μ1, μ2, ..., μm are the mean values of x1, x2, ..., xm respectively, then

μ = E(x) = [μ1, μ2, ..., μm]'

and the variance-covariance matrix of x is

Σxx = E[(x - μ)(x - μ)']

Noting that E[(xi - μi)(xj - μj)] = σij, the covariance of xi and xj, and that σij = σji, we obtain the symmetric matrix

Σxx = [ σ1^2   σ12    ...   σ1m
        σ12    σ2^2   ...   σ2m
        ...    ...    ...   ...
        σ1m    σ2m    ...   σm^2 ]

Note: the variances of the individual random variables form the main diagonal of Σxx; Σxx is the variance-covariance matrix of x. If the random variables in x are uncorrelated, all covariance (off-diagonal) elements of Σxx are zero and the matrix is diagonal. The relationship between the weight matrix W and the corresponding variance-covariance matrix, with subscripts added to indicate reference to the random vector x, is restated as

Wxx = σ0^2 Σxx^(-1)

where σ0^2 is the reference variance.
Caution: if Wxx is non-diagonal, the simple weights calculated in

W1 = σ0^2 / σ1^2
W2 = σ0^2 / σ2^2
  ...                                        (4-16)
Wm = σ0^2 / σm^2

are not to be used as diagonal elements of Wxx; only when Wxx is diagonal are the weights calculated in (4-16) identical to its diagonal elements.
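The caution can be demonstrated numerically: with a non-diagonal Σxx, the simple weights of (4-16) differ from the diagonal of Wxx = σ0^2 Σxx^(-1), and they agree once the covariances are zeroed out. A sketch with made-up variances and covariance:

```python
import numpy as np

sigma0_sq = 1.0
cov = np.array([[4.0, 1.5],
                [1.5, 9.0]])                 # non-diagonal variance-covariance matrix

W = sigma0_sq * np.linalg.inv(cov)           # full weight matrix W_xx
simple = sigma0_sq / np.diag(cov)            # (4-16): W_i = sigma0^2 / sigma_i^2

print(np.diag(W))     # differs from `simple` because cov is non-diagonal
print(simple)

# With the covariance removed, the two coincide:
W_diag = sigma0_sq * np.linalg.inv(np.diag(np.diag(cov)))
print(np.allclose(np.diag(W_diag), simple))
```

Using the simple weights here would silently ignore the correlation between the two observations.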
Example 1
Two observations are represented by the random vector

x = [x1, x2]'

The variances of x1 and x2 are σ1^2 and σ2^2 respectively, the covariance of x1 and x2 is σ12, and the correlation coefficient is ρ12.

(a) For a selected reference variance σ0^2, derive the weight matrix of x in terms of the given parameters.
(b) Show that the weights calculated in (4-16) are identical to the diagonal elements of the weight matrix only when σ12 = 0.
Solution

(a) The weight matrix of x is

Wxx = σ0^2 Σxx^(-1) = ( σ0^2 / (σ1^2 σ2^2 - σ12^2) ) [  σ2^2  -σ12
                                                        -σ12   σ1^2 ]

Since σ12 = ρ12 σ1 σ2, we have σ1^2 σ2^2 - σ12^2 = σ1^2 σ2^2 (1 - ρ12^2); thus

Wxx = ( σ0^2 / (σ1^2 σ2^2 (1 - ρ12^2)) ) [  σ2^2  -σ12
                                            -σ12   σ1^2 ]

(b) From (4-16),

W1 = σ0^2 / σ1^2   and   W2 = σ0^2 / σ2^2
The diagonal elements of Wxx are

σ0^2 / (σ1^2 (1 - ρ12^2))   and   σ0^2 / (σ2^2 (1 - ρ12^2)),

which equal W1 and W2 only when ρ12 = 0. When ρ12 ≠ 0, the weights W1 and W2 cannot be used as diagonal elements of Wxx.

Each element of Σxx can be divided by σ0^2 to yield a scaled version of Σxx called Qxx (the cofactor matrix of x):

Qxx = (1 / σ0^2) Σxx,   so that   Σxx = σ0^2 Qxx

Qxx is also called the relative covariance matrix.
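The closed form derived in (a) can be checked against a direct matrix inverse. A sketch with illustrative values σ1 = 2, σ2 = 3, ρ12 = 0.5:

```python
import numpy as np

s1, s2, rho, sigma0_sq = 2.0, 3.0, 0.5, 1.0
s12 = rho * s1 * s2                              # covariance sigma_12

cov = np.array([[s1**2, s12],
                [s12, s2**2]])
W_direct = sigma0_sq * np.linalg.inv(cov)        # W_xx by direct inversion

# Closed form from part (a)
factor = sigma0_sq / (s1**2 * s2**2 * (1 - rho**2))
W_closed = factor * np.array([[s2**2, -s12],
                              [-s12, s1**2]])

print(np.allclose(W_direct, W_closed))
```

The two agree because σ1^2 σ2^2 (1 - ρ12^2) is exactly the determinant of Σxx.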
The variance-covariance matrix (or covariance matrix) of an m × 1 random vector x is the m × m matrix V(x) (also written Var(x) or Cov(x)) defined by

V(x) = E[(x - E(x))(x - E(x))']

when the expectations all exist. In particular, V(x) is symmetric, and it is diagonal if the elements of x are independent.

Properties:
- If a is a constant vector, V(a) = 0.
- If A is a constant matrix and b a constant vector, V(Ax + b) = A V(x) A'.
- V(x) is always non-negative definite.

The covariance between the random vector x1 and the random vector x2 is defined to be the matrix

Cov(x1, x2) = E[(x1 - E(x1))(x2 - E(x2))'],

when all expectations exist. If A and B are constant matrices and c and d are constant vectors,

Cov(A x1 + c, B x2 + d) = A Cov(x1, x2) B'
Conditional expectation

The conditional expectation of a random matrix X1 given X2 = A (A being a constant matrix), written E(X1 | X2 = A), is the expectation of X1 defined using the conditional distribution of its elements given X2 = A. The conditional expectation E(X1 | X2) is the expectation of X1 defined using the conditional distribution of its elements given X2.

The double expectation formula is

E(E(X1 | X2)) = E(X1)

The conditional variance-covariance matrix V(x1 | X2 = A) or V(x1 | X2) for a random vector x1 is defined by putting the appropriate conditional expectations into the definition of the variance-covariance matrix. The conditional variance formula applies:

V(x1) = E(V(x1 | X2)) + V(E(x1 | X2))

For random vectors x1 and x2, the conditional covariance Cov(x1, x2 | x3 = A) or Cov(x1, x2 | x3) is defined by putting the appropriate conditional expectations into the definition of the covariance. An additional covariance formula is

Cov(x1, x2) = E(Cov(x1, x2 | x3)) + Cov(E(x1 | x3), E(x2 | x3))
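The formula V(x1) = E(V(x1 | x2)) + V(E(x1 | x2)) can be checked exactly on a small discrete example. A sketch in which x2 picks one of two equally likely groups, each with hypothetical conditional means and variances:

```python
# Two equally likely groups for x2; within each, x1 has a known mean and variance.
p = [0.5, 0.5]            # P(x2 = j)
cond_mean = [0.0, 4.0]    # E(x1 | x2 = j)  (illustrative values)
cond_var = [1.0, 9.0]     # V(x1 | x2 = j)

# Right-hand side: E(V(x1|x2)) + V(E(x1|x2))
e_cond_var = sum(pj * v for pj, v in zip(p, cond_var))
mean_of_means = sum(pj * m for pj, m in zip(p, cond_mean))
var_of_means = sum(pj * (m - mean_of_means) ** 2 for pj, m in zip(p, cond_mean))
rhs = e_cond_var + var_of_means

# Left-hand side: V(x1) = E(x1^2) - E(x1)^2, using E(x1^2 | x2) = V + mean^2
e_x1 = mean_of_means
e_x1_sq = sum(pj * (v + m**2) for pj, m, v in zip(p, cond_mean, cond_var))
lhs = e_x1_sq - e_x1**2
print(lhs, rhs)   # 9.0 9.0
```

The "within-group" variance contributes 5.0 and the "between-group" variance contributes 4.0, summing to the total variance 9.0.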
REFERENCES

Ronald E. Walpole, Raymond H. Myers, Sharon L. Myers, Probability and Statistics for Engineers and Scientists, 6th edition, pages 275-280.

Edward M. Mikhail, Analysis and Adjustment of Survey Measurements, School of Engineering, Purdue University, West Lafayette, Indiana.

Venkatarama Krishnan, Probability and Random Processes, John Wiley and Sons, 2006, pages 384-405.

Amos Storkey, MLPR lectures: Distributions and models, http://www.inf.ed.ac.uk/teaching/courses/mlpr/lectures/distnsandmodels-print4up.pdf, 2009. School of Informatics, University of Edinburgh.