Statistics II

  • Upload
    gambodc

  • View
    219

  • Download
    0

Embed Size (px)

Citation preview

  • 7/28/2019 Statistics II

    1/13

    Physics116C, 4/14/06

    D. Pellett

    Introduction toStatistics and Error

    Analysis II

    References:Data Reduction and Error Analysis for the Physical Sciences by Bevington and Robinson

    Particle Data Group notes on probability and statistics, etc.online athttp://pdg.lbl.gov/2004/reviews/contents_sports.html(Reference is S. Eidelman et al, Phys. Lett. B 592, 1 (2004))

    Any original presentations copyright D. Pellett 2006

    1

    http://pdg.lbl.gov/2004/reviews/contents_sports.htmlhttp://pdg.lbl.gov/2004/reviews/contents_sports.html
  • 7/28/2019 Statistics II

    2/13

    Topics

    Propagation of errors examples

    Background subtraction

    Error for x = a ln(bu)

    Estimates of mean and error

    method of maximum likelihood and error of the mean

    Weighted mean, error of weighted mean

    Confidence intervals

    Chi-square (2) distribution and degrees of freedom ()

    Histogram 2

    Format for lab writeup

    Next - least squares fit to a straight line and errors of parameters2

  • 7/28/2019 Statistics II

    3/13

    Overview: Propagation of Errors

    Brief overview Suppose we havex = f(u,v) and u and v are uncorrelated

    random variables with Gaussian distributions.

    Expand fin a Taylors series aroundx0 = f(u0,v0) where u0,v0are the mean values ofu and v, keeping the lowest orderterms:

    The distribution ofx is a bivariate distribution in u and v.

    Under suitable conditions (see Bevington Ch. 3) we canapproximate x (the standard deviation ofx) by

    x2

    f

    u

    2

    u2+

    f

    v

    2

    v2

    x x x0 =

    f

    u

    u+

    f

    v

    v

    3

  • 7/28/2019 Statistics II

    4/13

    Bevington Problem 2.8 Solution

    Cars at fork in road: two choices, right or left; p=0.75 for righton a typical day.

    Binomial distribution 1035 cars

    Mean = np = 1035x0.75 = 776.25,

    x = 752 cars turned right on a particular day

    Is this consistent with being a typical day?

    Find probability for a deviation from the mean this large orlarger, either positive or negative

    Could sum PB from 0 to 752 and from 800 to 1035

    Can approximate with Gaussian since n, x large, x still many less that n. (19 to be exact.)

    PB(x;1035, 0.75) =1035!

    x!(1035 x)!

    3

    4

    x14

    1035x

    = npq= 13.9

    4

  • 7/28/2019 Statistics II

    5/13

    Bevington 2.8 Solution (continued)

    Use Gaussian approximation with same ,. Transform interval to unit normal distribution in Limits of interval z, Probability for z within this interval given in Table C.2 of

    Bevington (probability of being within 1.745of):P(-1.745 < z 1.745) = 0.9190

    Probability that this is a typical day = 1- P = 0.081 (or 8.1%) Confidence interval: the above interval is a 91.9% confidence

    interval for z

    z should lie within this interval 91.9% of the time. Larger probability interval required to exclude hypothesis

    3would correspond to a 99.73% confidence interval

    Be careful: distribution might not be truly Gaussian

    z = (x )/.|z| = |752776.25|/13.9 = 1.745.

    5

  • 7/28/2019 Statistics II

    6/13

    Bevington 2.8 (concluded)

    You could also use the cumulative distribution function to find P

    This is available in LabVIEW:

    This calculates

    P(z 1.745) = 1.745

    12

    exp12z2 = 0.0405

    so the probability of this being a typical day is 2(.0405)=0.081, as above.

    6

  • 7/28/2019 Statistics II

    7/13

    Chi-Square Test of a Distribution

    Define

    For Gaussian random variables xi, it can be shown that thisfollows the chi-square distribution with degrees of freedom.

    The expectation value (i.e., mean value) for is

    If this is based on a fit where some parameters are determinedfrom the fit, the number of degrees of freedom is reduced by thenumber of fitted parameters

    One can set confidence intervals in to evaluate the goodness of

    fit

    distributions can be found from tables, graphs or LabVIEW VIsas was done for Gaussian probability. (See Table C.4 in Bevington)

    Can be approximated by Gaussian under suitable conditions

    2

    N

    i=1

    xi ii

    2

    2

    2

    2

    7

  • 7/28/2019 Statistics II

    8/13

    Comments on 2 of Histogram

    Can model this with N individual mutually independent binomials, so long

    as a fixedtotal is not required. Then normalize to the actual Ntotal, using 1degree of freedom in the fit (N = number of bins)

    (see discussion in Bevington, Ch. 3 and Prob. 4.13 solution on next page)

    For small ni, large Ntotal, large number of bins, ni is approx. Poisson

    2 2 distribution if all ni>>1(ni 5 OK) or N >>1

    For fixed Ntotal, model with multinomial distribution (see p. 12)

    But niare not mutually independent with multinomial

    2

    N

    i=1

    (Ni ni)2

    ni

    >

    8

  • 7/28/2019 Statistics II

    9/13

    Histogram Chi-Square: Bevington Prob. 4.13

    4.13) I made a LabVIEW VI (see figures on next page) to plot the histogram, calculate the Gaussiancomparison values and find 2 according to Eq. 4.33 of Bevington:

    2 =

    nj=1

    [h(xj)NP(xj)]2

    NP(xj)

    where n is the number of bins, N is the total number of trials, h(xj) is the contents of the jth binand NP(xj) is the expected contents from the Gaussian distribution (see the text for details).

    Assume the bins are small enough so we can approximate the integral of the p.d.f. over the binwith the central value of the p.d.f. times the bin width. Then

    NP(xj) = N

    xj

    p(x) dx Np(xj)x

    where p(xj) is the Gaussian p.d.f. evaluated at the center of the jth bin andx = 2 is the binwidth.

    p(xj) =1

    2

    exp

    1

    2

    xj

    2

    9

  • 7/28/2019 Statistics II

    10/13

    Histogram Chi-Square Result

    10

  • 7/28/2019 Statistics II

    11/13

    Histogram Chi-Square: Comments

    This analysis assumes the contents of each histogram bin hj is an independent measurement. and are given but the total number of trials is taken to agree with the experiment (200 trials).This represents one constraint and reduces the number of degrees of freedom by 1. We have 13bins to compare with the Gaussian and = 12 degrees of freedom.

    The expectation value for 2 equals the number of degrees of freedom, < 2 >= 12. The resulting

    2

    = 8.28 (disagrees with the answer in the book but was checked independently). The probabilityfor exceeding this value of 2 is 0.76 (calculated by LabVIEW but agrees with interpolated value

    from Table C.4). The reduced 2

    = 0.69 (expectation value of 1). This is not a bad fit. Ofcourse, the 2 distribution is only valid for underlying Gaussians and this is not a good assumptionfor the bins with low occupancy.

    11

  • 7/28/2019 Statistics II

    12/13

    Multinomial Distribution

    http://mathworld.wolfram.com/MultinomialDistribution.html

    Histogram with n bins, N total counts (partition N events into n

    bins),xicounts in ith bin, with probability pito get a count in ith bin

    n

    i=1

    xi = N,n

    i=1

    pi = 1

    P(x1, x2, . . . , xn) =N!

    x1!x2! . . . xn!px1

    1px2

    2. . . pxn

    n

    with

    i = Npi, i2 = Npi(1pi), ij

    2 = Npipj

    .

    Then

    Reference:

    12

    http://mathworld.wolfram.com/MultinomialDistribution.htmlhttp://mathworld.wolfram.com/MultinomialDistribution.html
  • 7/28/2019 Statistics II

    13/13

    Complete Lab Writeup

    Similar to research report. Outline as follows: Abstract (very brief overview stating results) Introduction (Overview and theory related to experiment) Experimental setup and procedure

    Analysis of data and results with errors Graphs should have axes labeled with units, usually points

    should have error bars and the graph should have a captionexplaining briefly what is plotted

    Comparison with accepted values, discussion of results anderrors; conclusions, if any.

    References Have draft/outline of paper and preliminary calculations at lab

    time Wednesday

    13