HUL215-lecture2.pdf



    HUL 215 - 2013-14 I Sem

    Review of Statistical Theory

• Fundamentals of Probability
  - Random variable
  - Probability distributions
  - Joint distributions, Conditional Distributions and Independence
  - Features of Probability Distributions
  - Features of Joint and Conditional Distributions
  - Some important distributions (Normal, Chi-square, t, F)
• Fundamentals of Mathematical Statistics
  - Population, Parameters, and Random Sampling
  - Finite sample properties of estimators
  - Parameter estimation, Confidence Intervals, Hypothesis testing


Basic concepts

• An experiment is any procedure that can, at least in theory, be infinitely repeated, and has a well-defined set of outcomes.

• A random variable is one that takes on numerical values and has an outcome that is determined by an experiment (e.g., coin flipping).

• A random variable that can only take on the values zero and one is called a Bernoulli (or binary) random variable.

• A discrete random variable is one that takes on only a finite or countably infinite number of values. Its distribution is described by the probabilities

  pj = P(X = xj), j = 1, 2, ..., k, where each pj is between 0 and 1, and p1 + p2 + ... + pk = 1.

• The probability density function (pdf) of X summarizes the information concerning the possible outcomes of X and the corresponding probabilities:

  f(xj) = pj, j = 1, 2, ..., k, with f(x) = 0 for any x not equal to xj for some j.
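A minimal Python sketch of a discrete pdf, assuming a fair six-sided die as the illustrative experiment (the values and probabilities below are not from the slides):

    # Discrete pdf of a fair six-sided die: f(xj) = pj = 1/6 for xj in {1, ..., 6}.
    pmf = {x: 1/6 for x in range(1, 7)}

    def f(x):
        # pdf of X: returns pj if x equals some xj, and 0 for any other x.
        return pmf.get(x, 0.0)

    assert abs(sum(pmf.values()) - 1.0) < 1e-12   # p1 + p2 + ... + pk = 1
    print(f(3), f(2.5))                           # 0.1666..., 0.0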


• A variable X is a continuous random variable if it takes on any real value with zero probability.

• When computing probabilities for continuous random variables, it is easiest to work with the cumulative distribution function (cdf):

  F(x) ≡ P(X ≤ x).

• For any number c, P(X > c) = 1 − F(c).

• For any numbers a < b, P(a < X < b) = F(b) − F(a).
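A short sketch of these cdf identities using scipy.stats; the standard normal is only an illustrative choice of continuous random variable:

    from scipy import stats

    X = stats.norm(loc=0, scale=1)     # an illustrative continuous random variable
    c, a, b = 1.0, -1.0, 1.0

    print(1 - X.cdf(c))                # P(X > c) = 1 - F(c)
    print(X.cdf(b) - X.cdf(a))         # P(a < X < b) = F(b) - F(a)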


    Joint Distributions and Independence

• Let X and Y be discrete random variables. Then (X, Y) have a joint distribution, which is fully described by the joint probability density function of (X, Y):

  fX,Y(x, y) = P(X = x, Y = y),

  where the right-hand side is the probability that X = x and Y = y.

• Random variables X and Y are said to be independent if and only if

  fX,Y(x, y) = fX(x) fY(y)

  for all x and y, where fX is the pdf of X, and fY is the pdf of Y.

• In the context of more than one random variable, the pdfs fX and fY are often called marginal probability density functions to distinguish them from the joint pdf fX,Y.

• If X and Y are independent, then knowing the outcome of X does not change the probabilities of the possible outcomes of Y, and vice versa.
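A sketch of the independence check fX,Y(x, y) = fX(x)fY(y) for a small joint pmf; the 2x3 table below is made up for illustration:

    import numpy as np

    # Joint pmf fX,Y(x, y): rows index the x values, columns index the y values.
    f_xy = np.array([[0.10, 0.20, 0.10],
                     [0.15, 0.30, 0.15]])

    f_x = f_xy.sum(axis=1)             # marginal pdf of X
    f_y = f_xy.sum(axis=0)             # marginal pdf of Y

    # X and Y are independent iff the joint pmf equals the product of the marginals.
    print(np.allclose(f_xy, np.outer(f_x, f_y)))   # True for this particular table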


    Conditional Distributions

• The conditional probability density function of Y given X is defined by

  fY|X(y|x) = fX,Y(x, y)/fX(x) for all values of x such that fX(x) > 0.

• When X and Y are discrete, fY|X(y|x) = P(Y = y | X = x), where the right-hand side is read as the probability that Y = y given that X = x.
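A sketch of the same division fY|X(y|x) = fX,Y(x, y)/fX(x) applied row by row to a made-up joint pmf:

    import numpy as np

    f_xy = np.array([[0.10, 0.20, 0.10],    # made-up joint pmf: rows = x values, cols = y values
                     [0.05, 0.30, 0.25]])
    f_x = f_xy.sum(axis=1)                  # marginal pdf of X

    f_y_given_x = f_xy / f_x[:, None]       # conditional pmf fY|X(y|x)
    print(f_y_given_x)                      # each row sums to 1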


Features of Probability Distributions

• If X is a random variable, the expected value (or expectation) of X, denoted E(X) and sometimes μX or simply μ, is a weighted average of all possible values of X. The weights are determined by the probability density function:

  E(X) = x1 f(x1) + x2 f(x2) + ... + xk f(xk).

• If X is a continuous random variable, then E(X) is defined as an integral:

  E(X) = ∫ x f(x) dx.

• Given a random variable X and a function g(·), we can create a new random variable g(X). The expected value of g(X) is, again, simply a weighted average:

  E[g(X)] = g(x1) f(x1) + g(x2) f(x2) + ... + g(xk) f(xk).
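A sketch of both expectations for a made-up discrete distribution, with the continuous case approximated by numerical integration over a standard normal (an arbitrary choice for illustration):

    import numpy as np
    from scipy import stats, integrate

    # Discrete case: E(X) and E[g(X)] as probability-weighted averages.
    x = np.array([-1.0, 0.0, 2.0])
    p = np.array([0.2, 0.5, 0.3])
    g = lambda v: v ** 2

    EX  = np.sum(x * p)                  # E(X) = x1 f(x1) + ... + xk f(xk)
    EgX = np.sum(g(x) * p)               # E[g(X)] = g(x1) f(x1) + ... + g(xk) f(xk)

    # Continuous case: E(X) = integral of x f(x) dx, here for the standard normal.
    EX_cont, _ = integrate.quad(lambda v: v * stats.norm.pdf(v), -np.inf, np.inf)
    print(EX, EgX, EX_cont)              # 0.4, 1.4, ~0.0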


    Variance

• Just as we needed a number to summarize the central tendency of X, we need a number that tells us how far X is from μ, on average:

  Var(X) = E[(X − μ)²] = E(X²) − μ².

• Variance is sometimes denoted σX², or simply σ², when the context is clear.

• The standard deviation of a random variable, denoted sd(X), is simply the positive square root of the variance: sd(X) = +√Var(X).

• Standardizing a random variable: Z ≡ (X − μ)/σ has E(Z) = 0 and Var(Z) = 1.
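A sketch using the same kind of discrete distribution, checking both expressions for the variance and that the standardized variable has mean 0 and variance 1:

    import numpy as np

    x = np.array([-1.0, 0.0, 2.0])
    p = np.array([0.2, 0.5, 0.3])

    mu  = np.sum(x * p)                       # E(X)
    var = np.sum((x - mu) ** 2 * p)           # Var(X) = E[(X - mu)^2]
    sd  = np.sqrt(var)                        # sd(X): positive square root of the variance

    z = (x - mu) / sd                         # values of the standardized variable Z
    print(np.sum(z * p), np.sum(z ** 2 * p))  # E(Z) = 0 and Var(Z) = 1 (up to rounding)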


    Measures of Association: Covariance and Correlation

• The covariance between two random variables X and Y, sometimes called the population covariance to emphasize that it concerns the relationship between two variables describing a population, is defined as the expected value of the product (X − μX)(Y − μY):

  Cov(X, Y) = E[(X − μX)(Y − μY)].

• Covariance measures the amount of linear dependence between two random variables.

• If X and Y are independent, then Cov(X, Y) = 0.

• Zero covariance between X and Y does not imply that X and Y are independent. (Try Y = X² with X symmetric about zero.)
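A sketch of the Y = X² example, assuming X takes the values −1, 0, 1 with equal probability (so X is symmetric about zero):

    import numpy as np

    x = np.array([-1.0, 0.0, 1.0])
    p = np.array([1/3, 1/3, 1/3])
    y = x ** 2                                   # Y = X^2 is a function of X, so X and Y are not independent

    mu_x, mu_y = np.sum(x * p), np.sum(y * p)
    cov = np.sum((x - mu_x) * (y - mu_y) * p)    # Cov(X, Y) = E[(X - mu_X)(Y - mu_Y)]
    print(cov)                                   # 0.0: zero covariance despite perfect dependence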


    Variance of Sums of Random Variables

Variance properties (for constants a and b, and independent X and Y):

  Var(aX + b) = a²·Var(X)

  Var(X + Y) = Var(X) + Var(Y)

  Var(aX + bY) = a²·Var(X) + b²·Var(Y)
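A Monte Carlo sketch of the last property; the distributions and constants are arbitrary choices for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000
    a, b = 2.0, -3.0

    X = rng.normal(1.0, 2.0, n)          # Var(X) = 4, drawn independently of Y
    Y = rng.uniform(0.0, 1.0, n)         # Var(Y) = 1/12

    lhs = np.var(a * X + b * Y)
    rhs = a**2 * np.var(X) + b**2 * np.var(Y)
    print(lhs, rhs)                      # both close to 4*4 + 9*(1/12) = 16.75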


    Conditional Expectation

• Suppose we know that X has taken on a particular value, say x. Then, we can compute the expected value of Y, given that we know this outcome of X. We denote this expected value by E(Y|X = x), or sometimes E(Y|x) for shorthand.

• When Y is a discrete random variable taking on values {y1, ..., ym}, then

  E(Y|x) = y1 fY|X(y1|x) + y2 fY|X(y2|x) + ... + ym fY|X(ym|x).

• The expected value of crime rate given literacy rate could be a linear function:

  E(CRIME | LIT) = α + β·LIT.
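A sketch computing E(Y|X = x) from a small joint pmf; the table and value labels are made up for illustration:

    import numpy as np

    x_vals = np.array([0.0, 1.0])
    y_vals = np.array([10.0, 20.0, 30.0])
    f_xy = np.array([[0.10, 0.20, 0.10],     # made-up joint pmf: rows = x values, cols = y values
                     [0.05, 0.30, 0.25]])

    f_y_given_x = f_xy / f_xy.sum(axis=1, keepdims=True)   # conditional pmf fY|X(y|x)

    # E(Y|X = x) = y1 fY|X(y1|x) + ... + ym fY|X(ym|x), one number per x.
    E_y_given_x = f_y_given_x @ y_vals
    print(dict(zip(x_vals, E_y_given_x)))    # {0.0: 20.0, 1.0: ~23.33}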


Properties of Conditional Expectation

• E[c(X)|X] = c(X), for any function c(X).

• For functions a(X) and b(X), E[a(X)Y + b(X)|X] = a(X)E(Y|X) + b(X).

• If X and Y are independent, then E(Y|X) = E(Y).

• Law of iterated expectations: E[E(Y|X)] = E(Y).

• A more general case: E(Y|X) = E[E(Y|X, Z)|X].

• If E(Y|X) = E(Y), then Cov(X, Y) = 0. In fact, every function of X is uncorrelated with Y. (The converse is NOT true.)

  - If X and Y are correlated, then E(Y|X) must depend on X.

  - The conditional expectation captures nonlinear relationships between X and Y, whereas correlation captures only linear association (remember the example of Y = X²).

• Quick exercise:

  - If U and X are random variables such that E(U|X) = 0, argue that E(U) = 0 and that U and X are uncorrelated.
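A minimal simulation sketch of the exercise; one way to construct a U with E(U|X) = 0 is U = X·ε with ε mean-zero and independent of X (that construction, and the distributions, are assumptions for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000

    X   = rng.normal(2.0, 1.0, n)
    eps = rng.normal(0.0, 1.0, n)        # mean-zero noise, independent of X
    U   = X * eps                        # then E(U|X) = X * E(eps) = 0

    print(U.mean())                      # ~0: E(U) = E[E(U|X)] = 0 by iterated expectations
    print(np.cov(X, U)[0, 1])            # ~0: U is uncorrelated with X
    print(np.cov(X**2, U)[0, 1])         # ~0: and with functions of X, e.g. X^2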


    Conditional Variance

• Var(Y|X = x) = E{[Y − E(Y|x)]²|x} = E(Y²|x) − [E(Y|x)]².

• If X and Y are independent, then Var(Y|X) = Var(Y).
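A sketch checking that the two expressions for Var(Y|X = x) agree on a made-up conditional pmf:

    import numpy as np

    y_vals = np.array([10.0, 20.0, 30.0])
    f_y_given_x = np.array([0.25, 0.50, 0.25])   # made-up conditional pmf of Y given X = x

    E_y  = np.sum(y_vals * f_y_given_x)          # E(Y|x)
    E_y2 = np.sum(y_vals**2 * f_y_given_x)       # E(Y^2|x)

    var_def      = np.sum((y_vals - E_y)**2 * f_y_given_x)   # E{[Y - E(Y|x)]^2 | x}
    var_shortcut = E_y2 - E_y**2                              # E(Y^2|x) - [E(Y|x)]^2
    print(var_def, var_shortcut)                              # both 50.0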


    Some important distributions

• Normal Distribution:

  - A normal random variable is a continuous random variable that can take on any value. Its probability density function has the familiar bell shape.


    Normal Distribution

• Mathematically, the pdf is given by

  f(x) = [1/(σ√(2π))] · exp[−(x − μ)²/(2σ²)],  −∞ < x < ∞,

  where μ = E(X) and σ² = Var(X). Written as X ~ Normal(μ, σ²).

• The normal distribution is symmetric about its mean.

• The normal distribution is sometimes called the Gaussian distribution after the famous mathematician C. F. Gauss.

• If X is a positive random variable, such as income, and Y = log(X) has a normal distribution, then we say that X has a lognormal distribution.
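A sketch checking the normal pdf written above against scipy and illustrating the lognormal relationship; the parameters μ = 1 and σ = 0.5 are arbitrary choices:

    import numpy as np
    from scipy import stats

    mu, sigma = 1.0, 0.5

    # The pdf above, evaluated at a few points, against scipy's implementation.
    x = np.array([0.0, 1.0, 2.5])
    pdf_manual = np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    print(np.allclose(pdf_manual, stats.norm(mu, sigma).pdf(x)))   # True

    # Lognormal relationship: if Y = log(X) is Normal(mu, sigma^2), then X = exp(Y) is lognormal.
    rng = np.random.default_rng(0)
    X = np.exp(rng.normal(mu, sigma, 1_000_000))
    print(X.mean(), np.exp(mu + sigma**2 / 2))    # sample mean vs. the lognormal mean, both ~3.08
    print(np.log(X).mean(), np.log(X).std())      # ~1.0 and ~0.5: log(X) is Normal(mu, sigma^2) again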