STAT 552 PROBABILITY AND STATISTICS II


  • STAT 552 PROBABILITY AND STATISTICS II: INTRODUCTION. Short review of S551.

  • WHAT IS STATISTICS? Statistics is the science of collecting, organizing, and describing data, and of drawing conclusions from it. That is, statistics is a way to get information from data; it is the science of uncertainty.

  • BASIC DEFINITIONS

    POPULATION: The collection of all items of interest in a particular study.
    SAMPLE: A set of data drawn from the population; a subset of the population available for observation.
    VARIABLE: A characteristic of interest about each element of a population or sample.
    PARAMETER: A descriptive measure of the population, e.g., the population mean.
    STATISTIC: A descriptive measure of a sample.

  • STATISTIC
    A statistic (or estimator) is any function of the random variables in a random sample that does not contain any unknown quantity. E.g., the sample mean $\bar{X}$ and the sample variance $S^2$ are statistics; quantities that involve unknown parameters, such as $\bar{X} - \mu$ or $(\bar{X} - \mu)/\sigma$, are NOT.

    Any observed or particular value of an estimator is an estimate.
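
    A minimal sketch in Python of the statistic/estimate distinction, using made-up data (the numbers below are purely illustrative):

    ```python
    import numpy as np

    # Hypothetical observed random sample (values assumed for illustration)
    x = np.array([4.2, 5.1, 3.8, 4.9, 5.3])

    # The sample mean and sample variance are statistics: functions of the
    # sample alone, containing no unknown parameters.
    xbar = x.mean()         # its observed value is an estimate of mu
    s2 = x.var(ddof=1)      # its observed value is an estimate of sigma^2

    print(f"estimate of the mean: {xbar:.3f}")
    print(f"estimate of the variance: {s2:.3f}")
    ```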

  • Sample Space
    The set of all possible outcomes of an experiment is called the sample space and is denoted by S.
    Determining the outcomes: build an exhaustive list of all possible outcomes, and make sure the listed outcomes are mutually exclusive.

  • RANDOM VARIABLES
    Variables whose observed value is determined by chance. A r.v. is a function defined on the sample space S that associates a real number with each outcome in S. Rvs are denoted by uppercase letters, and their observed values by lowercase letters.

  • DESCRIPTIVE STATISTICS
    Descriptive statistics involves the arrangement, summary, and presentation of data to enable meaningful interpretation and to support decision making. Descriptive statistics methods make use of graphical techniques and numerical descriptive measures.

  • Types of data: examples

    Quantitative
      Continuous: blood pressure, height, weight, age
      Discrete: number of children; number of attacks of asthma per week
    Categorical (Qualitative)
      Ordinal (ordered categories): grade of breast cancer; better, same, worse; disagree, neutral, agree
      Nominal (unordered categories): sex (male/female); alive or dead; blood group O, A, B, AB

  • [Diagram] POPULATION and SAMPLE: probability reasons from the population to the sample; statistical inference reasons from the sample back to the population.

  • PROBABILITY: A numerical value expressing the degree of uncertainty regarding the occurrence of an event; a measure of uncertainty.

    STATISTICAL INFERENCE: The science of drawing inferences about the population based only on a part of the population, the sample.

  • Probability
    A probability is a function $P: S \to [0, 1]$; its domain is the sample space S and its range is the interval [0, 1].

  • THE CALCULUS OF PROBABILITIES
    If P is a probability function and A is any set, then
    a. $P(\emptyset) = 0$
    b. $P(A) \le 1$
    c. $P(A^{c}) = 1 - P(A)$

  • ODDS
    The odds of an event A is defined by

    $$\text{odds}(A) = \frac{P(A)}{1 - P(A)} = \frac{P(A)}{P(A^{c})}$$

    It tells us how much more likely the occurrence of event A is than its nonoccurrence.

  • ODDS RATIO
    The odds ratio (OR) is the ratio of two odds. It is useful for comparing the odds under two different conditions or for two different groups, e.g., the odds for males versus the odds for females.
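
    A small illustrative computation of odds and an odds ratio (the probabilities below are hypothetical, not taken from the slides):

    ```python
    def odds(p):
        """Odds of an event with probability p: P(A) / (1 - P(A))."""
        return p / (1 - p)

    # Hypothetical probabilities of an event in two groups
    p_male, p_female = 0.20, 0.10

    odds_male = odds(p_male)              # 0.25
    odds_female = odds(p_female)          # about 0.111
    odds_ratio = odds_male / odds_female  # 2.25: the odds are 2.25 times higher for males

    print(odds_male, odds_female, odds_ratio)
    ```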

  • CONDITIONAL PROBABILITY
    (Marginal) probability P(A): how likely is it that an event A will occur when an experiment is performed?
    Conditional probability P(A|B): how will the probability of event A be affected by the knowledge of the occurrence or nonoccurrence of event B?
    If two events are independent, then P(A|B) = P(A).

  • CONDITIONAL PROBABILITY

    $$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B) > 0$$

  • BAYES THEOREM
    Suppose you have P(B|A), but need P(A|B):

    $$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid A^{c})\,P(A^{c})}$$
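
    A short numerical illustration of Bayes' theorem (the prevalence and test accuracies below are hypothetical):

    ```python
    # Hypothetical screening-test example
    p_disease = 0.01              # P(A): prevalence
    p_pos_given_disease = 0.95    # P(B|A): sensitivity
    p_pos_given_healthy = 0.05    # P(B|A^c): false-positive rate

    # Law of total probability: P(B)
    p_pos = (p_pos_given_disease * p_disease
             + p_pos_given_healthy * (1 - p_disease))

    # Bayes' theorem: P(A|B)
    p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
    print(round(p_disease_given_pos, 3))  # about 0.161
    ```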

  • Independence
    A and B are independent iff P(A|B) = P(A), or P(B|A) = P(B), or $P(A \cap B) = P(A)P(B)$.
    $A_1, A_2, \ldots, A_n$ are mutually independent iff

    $$P\Big(\bigcap_{i \in J} A_i\Big) = \prod_{i \in J} P(A_i) \quad \text{for every subset } J \text{ of } \{1, 2, \ldots, n\}.$$

    E.g., for n = 3, $A_1, A_2, A_3$ are mutually independent iff $P(A_1 \cap A_2 \cap A_3) = P(A_1)P(A_2)P(A_3)$ and $P(A_1 \cap A_2) = P(A_1)P(A_2)$ and $P(A_1 \cap A_3) = P(A_1)P(A_3)$ and $P(A_2 \cap A_3) = P(A_2)P(A_3)$.

  • DISCRETE RANDOM VARIABLES
    If the set of all possible values of a r.v. X is a countable set, then X is called a discrete r.v. The function f(x) = P(X = x) for x = x1, x2, ... that assigns the probability to each value x is called the probability density function (p.d.f.) or probability mass function (p.m.f.).

  • Example
    Discrete Uniform distribution:

    $$f(x) = \frac{1}{N}, \qquad x = 1, 2, \ldots, N$$

    Example: throw a fair die. $P(X=1) = \cdots = P(X=6) = 1/6$.
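
    A quick simulation sketch (using NumPy) to check the fair-die pmf empirically:

    ```python
    import numpy as np

    rng = np.random.default_rng(seed=0)
    rolls = rng.integers(1, 7, size=100_000)   # simulate a fair six-sided die

    # Each empirical relative frequency should be close to 1/6 (about 0.167)
    values, counts = np.unique(rolls, return_counts=True)
    for v, c in zip(values, counts):
        print(v, c / len(rolls))
    ```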

  • CONTINUOUS RANDOM VARIABLES
    When the sample space is uncountable (continuous).
    Example: Continuous Uniform(a, b):

    $$f(x) = \frac{1}{b - a}, \qquad a \le x \le b$$

  • CUMULATIVE DISTRIBUTION FUNCTION (C.D.F.)

    The CDF of a r.v. X is defined as $F(x) = P(X \le x)$.

  • JOINT DISCRETE DISTRIBUTIONS
    A function $f(x_1, x_2, \ldots, x_k)$ is the joint pmf for some vector-valued rv $X = (X_1, X_2, \ldots, X_k)$ iff the following properties are satisfied: $f(x_1, x_2, \ldots, x_k) \ge 0$ for all $(x_1, x_2, \ldots, x_k)$, and $\sum_{x_1} \cdots \sum_{x_k} f(x_1, x_2, \ldots, x_k) = 1$.

  • MARGINAL DISCRETE DISTRIBUTIONS
    If the pair $(X_1, X_2)$ of discrete random variables has the joint pmf $f(x_1, x_2)$, then the marginal pmfs of $X_1$ and $X_2$ are

    $$f_1(x_1) = \sum_{x_2} f(x_1, x_2) \qquad \text{and} \qquad f_2(x_2) = \sum_{x_1} f(x_1, x_2).$$

  • CONDITIONAL DISTRIBUTIONS
    If $X_1$ and $X_2$ are discrete or continuous random variables with joint pdf $f(x_1, x_2)$, then the conditional pdf of $X_2$ given $X_1 = x_1$ is defined by

    $$f(x_2 \mid x_1) = \frac{f(x_1, x_2)}{f_1(x_1)}, \qquad f_1(x_1) > 0.$$

    For independent rvs, $f(x_2 \mid x_1) = f_2(x_2)$ and $f(x_1, x_2) = f_1(x_1)\,f_2(x_2)$.
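
    A small sketch with an illustrative (made-up) joint pmf table, computing marginals and a conditional pmf:

    ```python
    import numpy as np

    # Hypothetical joint pmf f(x1, x2): rows are x1 in {0, 1}, columns are x2 in {0, 1, 2}
    joint = np.array([[0.10, 0.20, 0.10],
                      [0.20, 0.25, 0.15]])
    assert np.isclose(joint.sum(), 1.0)

    f1 = joint.sum(axis=1)   # marginal pmf of X1
    f2 = joint.sum(axis=0)   # marginal pmf of X2

    # Conditional pmf of X2 given X1 = 0: f(x2 | x1=0) = f(0, x2) / f1(0)
    cond_x2_given_x1_0 = joint[0] / f1[0]

    print("marginal of X1:", f1)
    print("marginal of X2:", f2)
    print("f(x2 | X1 = 0):", cond_x2_given_x1_0)
    ```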

  • EXPECTED VALUES
    Let X be a rv with pdf $f_X(x)$ and g(X) be a function of X. Then the expected value (or the mean or the mathematical expectation) of g(X) is

    $$E[g(X)] = \sum_{x} g(x) f_X(x) \ \ \text{(discrete case)} \qquad \text{or} \qquad E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx \ \ \text{(continuous case)},$$

    provided the sum or the integral exists, i.e., $E[\,|g(X)|\,] < \infty$.

  • EXPECTED VALUES
    E[g(X)] is finite if E[|g(X)|] is finite.

  • Laws of Expected Value and Variance
    Let X be a rv and c be a constant.
    Laws of Expected Value: E(c) = c; E(X + c) = E(X) + c; E(cX) = cE(X).
    Laws of Variance: V(c) = 0; V(X + c) = V(X); V(cX) = c^2 V(X).
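
    A quick numerical check of these laws for an arbitrary discrete rv (the pmf below is made up):

    ```python
    import numpy as np

    # Hypothetical discrete rv: support and pmf
    x = np.array([1.0, 2.0, 5.0])
    p = np.array([0.2, 0.5, 0.3])

    def E(vals):
        return np.sum(vals * p)            # expectation under this pmf

    def V(vals):
        return E(vals**2) - E(vals)**2     # variance under this pmf

    c = 3.0
    print(np.isclose(E(x + c), E(x) + c))      # E(X + c) = E(X) + c
    print(np.isclose(E(c * x), c * E(x)))      # E(cX) = c E(X)
    print(np.isclose(V(x + c), V(x)))          # V(X + c) = V(X)
    print(np.isclose(V(c * x), c**2 * V(x)))   # V(cX) = c^2 V(X)
    ```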

  • EXPECTED VALUE
    If X and Y are independent, $E(XY) = E(X)E(Y)$.
    The covariance of X and Y is defined as $Cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - E(X)E(Y)$.

  • EXPECTED VALUE
    If X and Y are independent, $Cov(X, Y) = 0$.
    The reverse is usually not correct! It is only correct under the normal distribution: if $(X, Y) \sim$ Normal, then X and Y are independent iff $Cov(X, Y) = 0$.

  • EXPECTED VALUE
    If $X_1$ and $X_2$ are independent, $Var(X_1 \pm X_2) = Var(X_1) + Var(X_2)$.
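
    A sketch of the "reverse is not correct" point above: in this made-up example X is uniform on {-1, 0, 1} and Y = X^2, so Cov(X, Y) = 0 even though Y is completely determined by X:

    ```python
    import numpy as np

    rng = np.random.default_rng(seed=1)
    x = rng.choice([-1, 0, 1], size=200_000)   # X uniform on {-1, 0, 1}
    y = x**2                                   # Y = X^2 is a function of X

    # Sample covariance is approximately zero ...
    print(np.cov(x, y)[0, 1])

    # ... but X and Y are clearly dependent: P(Y=1 | X=0) = 0 while P(Y=1) = 2/3
    print(np.mean(y[x == 0] == 1), np.mean(y == 1))
    ```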

  • CONDITIONAL EXPECTATION AND VARIANCE

    $$E(X) = E[E(X \mid Y)]$$

  • CONDITIONAL EXPECTATION AND VARIANCE

    $$Var(X) = E[Var(X \mid Y)] + Var[E(X \mid Y)] \qquad \text{(EVVE rule)}$$

    Proofs available in Casella & Berger (1990), pp. 154 and 158.
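
    A Monte Carlo sanity check of both identities, under an assumed two-stage model (Y ~ Bernoulli(0.3), then X | Y = y ~ Normal(2y, 1)):

    ```python
    import numpy as np

    rng = np.random.default_rng(seed=2)
    n = 500_000

    y = rng.binomial(1, 0.3, size=n)          # Y ~ Bernoulli(0.3)
    x = rng.normal(loc=2.0 * y, scale=1.0)    # X | Y=y ~ N(2y, 1)

    # E(X) versus E[E(X|Y)] = 0.7*0 + 0.3*2 = 0.6
    print(x.mean(), 0.6)

    # Var(X) versus E[Var(X|Y)] + Var[E(X|Y)] = 1 + Var(2Y) = 1 + 4*0.3*0.7 = 1.84
    print(x.var(), 1.0 + 4 * 0.3 * 0.7)
    ```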

  • SOME MATHEMATICAL EXPECTATIONS
    Population Mean: $\mu = E(X)$
    Population Variance: $\sigma^2 = Var(X) = E(X - \mu)^2$ (a measure of the deviation from the population mean)
    Population Standard Deviation: $\sigma = \sqrt{\sigma^2}$
    Moments: the k-th moment is $\mu_k' = E(X^k)$ and the k-th central moment is $\mu_k = E(X - \mu)^k$.

  • The Variance

    $$Var(X) = E(X - \mu)^2 = E(X^2) - [E(X)]^2$$

  • MOMENT GENERATING FUNCTION
    The m.g.f. of a random variable X is defined as

    $$M_X(t) = E\big(e^{tX}\big), \qquad t \in (-h, h) \text{ for some } h > 0.$$

  • Properties of m.g.f.
    M(0) = E[1] = 1.

    If a r.v. X has m.g.f. $M_X(t)$, then Y = aX + b has m.g.f. $M_Y(t) = e^{bt} M_X(at)$.

    The m.g.f. does not always exist (e.g., the Cauchy distribution).
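
    A worked example (not from the slides, chosen for illustration): if X ~ Exponential with rate $\lambda$, then

    $$M_X(t) = E\big(e^{tX}\big) = \int_0^{\infty} e^{tx}\,\lambda e^{-\lambda x}\,dx = \frac{\lambda}{\lambda - t}, \qquad t < \lambda,$$

    so $M_X(0) = 1$, and for $Y = aX + b$ the property above gives $M_Y(t) = e^{bt} M_X(at) = \dfrac{\lambda e^{bt}}{\lambda - at}$ for $at < \lambda$.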

  • CHARACTERISTIC FUNCTION
    The c.h.f. of a random variable X is defined as $\phi_X(t) = E\big(e^{itX}\big)$ for all real numbers t. The c.h.f. always exists.

  • Uniqueness
    Theorem: If two r.v.s have m.g.f.s that exist and are equal, then they have the same distribution. If two r.v.s have the same distribution, then they have the same m.g.f. (if it exists). Similar statements are true for the c.h.f.

  • SOME DISCRETE PROBABILITY DISTRIBUTIONS
    Please review: Degenerate, Uniform, Bernoulli, Binomial, Poisson, Negative Binomial, Geometric, Hypergeometric, Extended Hypergeometric, Multinomial.

  • SOME CONTINUOUS PROBABILITY DISTRIBUTIONS
    Please review: Uniform, Normal (Gaussian), Exponential, Gamma, Chi-Square, Beta, Weibull, Cauchy, Log-Normal, t, F distributions.

  • TRANSFORMATION OF RANDOM VARIABLES
    If X is an rv with pdf f(x), then Y = g(X) is also an rv. What is the pdf of Y?
    If X is a discrete rv, replace x wherever you see it in the pdf f(x) by using the relation $x = g^{-1}(y)$.
    If X is a continuous rv, do the same thing, but also multiply by the Jacobian. If the transformation is not one-to-one, divide the region into sub-regions on which the transformation is one-to-one.

  • CDF method
    Example: for a given X and a transformation Y = g(X), what is the p.d.f. of Y?
    Solution: first find the c.d.f. of Y, $F_Y(y) = P(Y \le y) = P(g(X) \le y)$, and then differentiate it to obtain the p.d.f.
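
    A worked illustration of the CDF method under an assumed setup (not necessarily the slide's example): let X ~ Uniform(0, 1) and Y = -ln X. Then for y > 0,

    $$F_Y(y) = P(-\ln X \le y) = P(X \ge e^{-y}) = 1 - e^{-y},$$

    so $f_Y(y) = F_Y'(y) = e^{-y}$ for $y > 0$, i.e., $Y \sim \text{Exponential}(1)$.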

  • M.G.F. Method
    If $X_1, X_2, \ldots, X_n$ are independent random variables with m.g.f.s $M_{X_i}(t)$, then the m.g.f. of $Y = X_1 + X_2 + \cdots + X_n$ is

    $$M_Y(t) = \prod_{i=1}^{n} M_{X_i}(t).$$
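
    For example (illustrative, not from the slides), if $X_i \sim \text{Poisson}(\lambda_i)$ independently, then $M_{X_i}(t) = \exp\{\lambda_i (e^t - 1)\}$, so

    $$M_Y(t) = \prod_{i=1}^{n} \exp\{\lambda_i (e^t - 1)\} = \exp\Big\{\Big(\textstyle\sum_{i=1}^{n} \lambda_i\Big)(e^t - 1)\Big\},$$

    which is the m.g.f. of a Poisson$\big(\sum_i \lambda_i\big)$ rv; by the uniqueness theorem, $Y \sim \text{Poisson}\big(\sum_i \lambda_i\big)$.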

  • THE PROBABILITY INTEGRAL TRANSFORMATION
    Let X have continuous cdf $F_X(x)$ and define the rv Y as $Y = F_X(X)$. Then Y ~ Uniform(0, 1), that is, $P(Y \le y) = y$ for $0 < y < 1$.
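
    A simulation sketch of the probability integral transformation, assuming X ~ Exponential(1) (chosen arbitrarily):

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(seed=3)
    x = rng.exponential(scale=1.0, size=100_000)   # X ~ Exponential(1)
    y = stats.expon.cdf(x)                         # Y = F_X(X)

    # Y should look Uniform(0, 1): mean near 1/2, variance near 1/12,
    # and a Kolmogorov-Smirnov test against Uniform(0, 1) should not reject.
    print(y.mean(), y.var())
    print(stats.kstest(y, "uniform"))
    ```
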
  • SAMPLING DISTRIBUTION
    A statistic is also a random variable. Its distribution depends on the distribution of the random sample and on the form of the function $Y = T(X_1, X_2, \ldots, X_n)$. The probability distribution of a statistic Y is called the sampling distribution of Y.

  • SAMPLING FROM THE NORMAL DISTRIBUTION
    Properties of the Sample Mean and Sample Variance: let $X_1, X_2, \ldots, X_n$ be a r.s. of size n from a $N(\mu, \sigma^2)$ distribution. Then

    $$\bar{X} \sim N\big(\mu, \sigma^2/n\big), \qquad \bar{X} \text{ and } S^2 \text{ are independent}, \qquad \frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}.$$

  • SAMPLING FROM THE NORMAL DISTRIBUTION
    If the population variance is unknown, we use the sample variance:

    $$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}.$$

  • SAMPLING FROM THE NORMAL DISTRIBUTION
    The F distribution allows us to compare variances by giving the distribution of the ratio of independent sample variances, each scaled by its population variance, e.g. $\dfrac{S_X^2/\sigma_X^2}{S_Y^2/\sigma_Y^2} \sim F_{n-1,\,m-1}$ for independent normal samples of sizes n and m.
    If $X \sim F_{p,q}$, then $1/X \sim F_{q,p}$. If $X \sim t_q$, then $X^2 \sim F_{1,q}$.
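
    A Monte Carlo sketch of the sampling-distribution facts above (with arbitrarily chosen mu = 5, sigma = 2, n = 10):

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(seed=4)
    mu, sigma, n, reps = 5.0, 2.0, 10, 50_000

    samples = rng.normal(mu, sigma, size=(reps, n))
    xbar = samples.mean(axis=1)
    s2 = samples.var(axis=1, ddof=1)

    # Sample mean: should match N(mu, sigma^2 / n)
    print(xbar.mean(), mu, xbar.var(), sigma**2 / n)

    # (n-1) S^2 / sigma^2 should match a chi-square with n-1 degrees of freedom
    q = (n - 1) * s2 / sigma**2
    print(q.mean(), n - 1)                        # chi^2_{n-1} has mean n-1
    print(stats.kstest(q, "chi2", args=(n - 1,)))
    ```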

  • CENTRAL LIMIT THEOREM

    If a random sample is drawn from any population, the sampling distribution of the sample mean is approximately normal for a sufficiently large sample size. The larger the sample size, the more closely the sampling distribution of $\bar{X}$ will resemble a normal distribution.

    Random sample: $X_1, X_2, X_3, \ldots, X_n$.

  • Sampling Distribution of the Sample Mean

    $$\mu_{\bar{X}} = \mu \qquad \text{and} \qquad \sigma^2_{\bar{X}} = \frac{\sigma^2}{n}$$

    If X is normal, $\bar{X}$ is normal.

    If X is non-normal, $\bar{X}$ is approximately normally distributed for sample sizes greater than or equal to 30.
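
    A short simulation of the CLT, assuming a skewed parent distribution (Exponential(1), chosen for illustration):

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(seed=5)
    n, reps = 30, 20_000

    # Draw 'reps' samples of size n from a skewed (non-normal) population
    samples = rng.exponential(scale=1.0, size=(reps, n))
    xbar = samples.mean(axis=1)

    # Exponential(1) has mean 1 and variance 1, so xbar should be roughly N(1, 1/n)
    print(xbar.mean(), 1.0)
    print(xbar.var(), 1.0 / n)

    # The distribution of xbar is far less skewed than the parent distribution
    print(stats.skew(samples.ravel()), stats.skew(xbar))
    ```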