Dis Person

  • 8/9/2019 Dis Person

    Measures of Dispersion

    Prof G R C Nair

    Need

    A central value alone does not give a true picture of the distribution.

    Dispersion measures how the values are scattered around the central value.

    A low value shows that the observations are clustered close to the central value.

    A high value shows that they are scattered widely around the central value.

    Range

    Range = Largest value - Smallest value = (L - S)

    Advantages: simple, easy and quick to compute.

    Limitations: not an accurate or reliable measure, because it is not based on all the data and is influenced by extreme values.

    Percentiles & Quartiles

    Given any set of numerical observations, order them according to magnitude.

    The Pth percentile in the ordered set is the value below which lie P% (P percent) of the observations in the set.

    The position of the Pth percentile is given by (n+1)P/100, where n is the total number of observations in the set.

    A large department store collects data on sales made by each of its salespeople. The number of sales made on a given day by each of 20 salespeople is shown in the table below, together with the same data sorted by magnitude.

    Sales   Sorted Sales
      9          6
      6          9
     12         10
     10         12
     13         13
     15         14
     16         14
     14         15
     14         16
     16         16
     17         16
     16         17
     24         17
     21         18
     22         18
     18         19
     19         20
     18         21
     20         22
     17         24

    Percentiles

    To find the 90th percentile, determine the data point in position (n+1)P/100 = (20+1)(90/100) = 18.9. Thus, the percentile is located at the 18.9th position.

    The 18th observation is 21, and the 19th observation is 22. The 90th percentile is a point lying 0.9 of the way from 21 to 22 and is thus 21 + 0.9(22 - 21) = 21.9.
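
The position rule (n+1)P/100 with interpolation between neighbouring observations can be sketched in Python. This is a minimal illustration rather than the author's own procedure: the function name `percentile` and the clamping of positions that fall outside 1..n are assumptions added here.

```python
import math

# The 20 sales figures from the department store example.
SALES = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]


def percentile(values, p):
    """Return the p-th percentile using the (n + 1) * p / 100 position rule."""
    data = sorted(values)
    n = len(data)
    position = (n + 1) * p / 100        # 1-based rank, possibly fractional
    # Assumption (not in the slides): clamp positions that fall outside 1..n.
    position = min(max(position, 1), n)
    lower_rank = math.floor(position)   # rank of the observation just below
    fraction = position - lower_rank    # how far to move toward the next one
    if lower_rank == n:                 # already at the largest observation
        return float(data[-1])
    below, above = data[lower_rank - 1], data[lower_rank]
    return below + fraction * (above - below)


print(round(percentile(SALES, 90), 2))  # 21.9, matching the worked example
```

Sorting inside the function keeps the call safe for unsorted input, at the cost of re-sorting on every call.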

    Quartiles

    Quartiles are the percentage points that break down the ordered data set into quarters.

    The first quartile is the 25th percentile. It is the point below which lie 1/4 of the data.

    The second quartile is the 50th percentile. It is the point below which lie 1/2 of the data. This is also called the median.

    The third quartile is the 75th percentile. It is the point below which lie 3/4 of the data.

    The interquartile range is the difference between the third and the first quartiles.

    Quartile Deviation

    Quartile Deviation is (Q3 - Q1)/2

    For grouped data,

    Q1 = L1 + (0.25N - C)h/f
    Q3 = L3 + (0.75N - C)h/f

    where
    L1 = lower boundary of the first quartile class
    L3 = lower boundary of the third quartile class
    N = total frequency
    h = class width
    C = cumulative frequency up to the lower limit of the concerned quartile class
    f = frequency of the concerned quartile class
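
As a rough illustration of the grouped-data formula, the sketch below applies it to the bulb-life frequency table that appears later in this deck. The function name `grouped_quantile` and the choice of that table for this purpose are assumptions made for the example, not part of the slides.

```python
# Class boundaries (lower, upper) and frequencies from the bulb-life table.
CLASSES = [(500, 700), (700, 900), (900, 1100), (1100, 1300), (1300, 1500)]
FREQS = [5, 11, 26, 10, 8]


def grouped_quantile(classes, freqs, fraction):
    """Apply L + (fraction * N - C) * h / f to the class containing the quantile."""
    total = sum(freqs)                  # N
    target = fraction * total           # 0.25N for Q1, 0.75N for Q3
    cumulative = 0                      # C: frequency below the current class
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cumulative + f >= target:
            width = upper - lower       # h
            return lower + (target - cumulative) * width / f
        cumulative += f
    raise ValueError("fraction must lie between 0 and 1")


q1 = grouped_quantile(CLASSES, FREQS, 0.25)
q3 = grouped_quantile(CLASSES, FREQS, 0.75)
print(round(q1, 2), round(q3, 2))       # 881.82 1160.0
print(round((q3 - q1) / 2, 2))          # quartile deviation: 139.09
```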

    Quartiles: Example

    Position = (n+1)P/100, applied to the 20 sorted sales values:

    Quartile          Position                 Value
    First quartile    (20+1)(1/4) = 5.25       13 + (.25)(1) = 13.25
    Median            (20+1)(1/2) = 10.5       16 + (.5)(0) = 16
    Third quartile    (20+1)(3/4) = 15.75      18 + (.75)(1) = 18.75

    Range and Interquartile Range

    Sales   Sorted Sales   Rank
      9          6           1
      6          9           2
     12         10           3
     10         12           4
     13         13           5
     15         14           6
     16         14           7
     14         15           8
     14         16           9
     16         16          10
     17         16          11
     16         17          12
     24         17          13
     21         18          14
     22         18          15
     18         19          16
     19         20          17
     18         21          18
     20         22          19
     17         24          20

    Minimum = 6 (rank 1); Maximum = 24 (rank 20)

    First quartile: Q1 = 13 + (.25)(1) = 13.25

    Third quartile: Q3 = 18 + (.75)(1) = 18.75

    Range = Maximum - Minimum = 24 - 6 = 18

    Interquartile range = Q3 - Q1 = 18.75 - 13.25 = 5.5

    Quartile deviation = (Q3 - Q1)/2 = 2.75
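
These figures can be cross-checked with Python's standard library. The sketch below assumes Python 3.8 or later, where `statistics.quantiles` is available; its `"exclusive"` method interpolates at positions based on n + 1, which reproduces the quartiles above for this data set. The variable names are illustrative only.

```python
import statistics

# The 20 sales figures from the department store example.
SALES = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]

# n=4 asks for the three cut points that split the data into quarters.
q1, median, q3 = statistics.quantiles(SALES, n=4, method="exclusive")

data_range = max(SALES) - min(SALES)    # largest minus smallest
iqr = q3 - q1                           # interquartile range
quartile_deviation = iqr / 2            # half the interquartile range

print(q1, median, q3)                   # 13.25 16.0 18.75
print(data_range, iqr, quartile_deviation)  # 18 5.5 2.75
```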

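    Mean deviation

    Mean deviation $= \frac{\sum |x - \mu|}{N}$

    For grouped data, mean deviation $= \frac{\sum f\,|x - \mu|}{N}$

    Advantages: simple, easy to compute, and considers all the data.

    Limitations: less reliable because it ignores the signs of the deviations, and not well suited to further mathematical treatment.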
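
A short Python sketch of both forms of the mean deviation follows. The function names are hypothetical, and the inputs are borrowed from other slides in this deck (the family ages 2, 18, 34, 42 and the bulb-life class mid values), purely as an illustration.

```python
def mean_deviation(values):
    """Average absolute deviation from the arithmetic mean (ungrouped data)."""
    n = len(values)
    mu = sum(values) / n
    return sum(abs(x - mu) for x in values) / n


def grouped_mean_deviation(midpoints, freqs):
    """Frequency-weighted version, with each class represented by its mid value."""
    n = sum(freqs)
    mu = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return sum(f * abs(x - mu) for x, f in zip(midpoints, freqs)) / n


ages = [2, 18, 34, 42]
print(mean_deviation(ages))             # 14.0  (mean is 24; |deviations| sum to 56)

midpoints = [600, 800, 1000, 1200, 1400]
freqs = [5, 11, 26, 10, 8]
print(round(grouped_mean_deviation(midpoints, freqs), 2))
```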

    Variance

    The variance is the average of the squared deviations from the population mean.

    All values are used in the calculation.

    Unlike the range, it does not depend only on the extreme values, although squaring gives large deviations extra weight.

    The units are awkward: the square of the original units.

    The ages of a family are: 2, 18, 34, 42. What is the variance?

    $\mu = \frac{\sum X}{N} = \frac{96}{4} = 24$

    $\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{(2 - 24)^2 + \dots + (42 - 24)^2}{4} = \frac{944}{4} = 236$

    Standard Deviation

    Standard deviation is the square root of the variance.

    Find the standard deviation for the last problem:

    $\sigma = \sqrt{\sigma^2} = \sqrt{236} = 15.36$
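
The family-ages calculation can be reproduced in a few lines of Python. This is only a sketch: the helper name `population_variance` is made up for the example, and the standard library's `statistics.pvariance` and `statistics.pstdev` (population versions, dividing by N) are used as a cross-check.

```python
import math
import statistics

ages = [2, 18, 34, 42]


def population_variance(values):
    """Average of the squared deviations from the mean (divides by N)."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


variance = population_variance(ages)
std_dev = math.sqrt(variance)

print(variance)                 # 236.0
print(round(std_dev, 2))        # 15.36

# Cross-check against the standard library's population formulas.
print(statistics.pvariance(ages), round(statistics.pstdev(ages), 2))
```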

    Short cut Formula

    Std deviation: $\sigma = i \times \left[ \frac{\sum f d^2}{N} - \left( \frac{\sum f d}{N} \right)^2 \right]^{1/2}$

    where
    $d = (m - A)/i$
    $i$ = class interval
    $m$ = mid value of class
    $A$ = assumed mean

    Example (homework; answer hidden)

    A factory produces bulbs whose length of life was found to be as given in the table below. Find the mean life and the standard deviation by the normal method and by the short cut method.

    Life          No. of lamps
    500-700            5
    700-900           11
    900-1100          26
    1100-1300         10
    1300-1500          8

    Mean = 1016.67; Assumed mean = 1000

    X (mid value)    d    d^2     f    fd    fd^2
         600        -2     4      5   -10     20
         800        -1     1     11   -11     11
        1000         0     0     26     0      0
        1200         1     1     10    10     10
        1400         2     4      8    16     32
    Total                        60     5     73

    Std deviation $= 200 \times \left\{ \frac{73}{60} - \left( \frac{5}{60} \right)^2 \right\}^{1/2} \approx 219.97$
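
The short cut calculation can be checked against the direct (normal) method in Python. In this sketch the function names are hypothetical, and the mean is recovered from the coded deviations as A + i(Σfd/N); that step is a standard identity assumed here rather than something stated on the slides.

```python
import math

midpoints = [600, 800, 1000, 1200, 1400]   # class mid values m
freqs = [5, 11, 26, 10, 8]                 # number of lamps per class
ASSUMED_MEAN = 1000                        # A
CLASS_INTERVAL = 200                       # i


def shortcut_mean_sd(mids, fs, assumed_mean, interval):
    """Mean and standard deviation from coded deviations d = (m - A) / i."""
    n = sum(fs)
    d = [(m - assumed_mean) / interval for m in mids]
    sum_fd = sum(f * di for f, di in zip(fs, d))
    sum_fd2 = sum(f * di ** 2 for f, di in zip(fs, d))
    mean = assumed_mean + interval * sum_fd / n      # assumed identity, see lead-in
    sd = interval * math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)
    return mean, sd


def direct_mean_sd(mids, fs):
    """Mean and standard deviation straight from the mid values."""
    n = sum(fs)
    mean = sum(f * m for m, f in zip(mids, fs)) / n
    variance = sum(f * (m - mean) ** 2 for m, f in zip(mids, fs)) / n
    return mean, math.sqrt(variance)


mean_sc, sd_sc = shortcut_mean_sd(midpoints, freqs, ASSUMED_MEAN, CLASS_INTERVAL)
mean_d, sd_d = direct_mean_sd(midpoints, freqs)
print(round(mean_sc, 2), round(sd_sc, 2))   # 1016.67 219.97
print(round(mean_d, 2), round(sd_d, 2))     # same values from the direct method
```

Dividing by the class interval keeps the intermediate numbers small, which is the point of the short cut when working by hand.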

    Coefficient of Variation

    The coefficient of variation is the ratio of the standard deviation to the arithmetic mean, expressed as a percentage:

    $CV = \frac{\sigma}{\mu} \times 100\%$
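
For a concrete number, the sketch below computes the coefficient of variation for the family ages used in the variance example (2, 18, 34, 42). The function name is hypothetical, and the population standard deviation is assumed, to match the earlier slides.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation as a percentage of the arithmetic mean."""
    mu = statistics.fmean(values)
    if mu == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.pstdev(values) / mu * 100


ages = [2, 18, 34, 42]
print(round(coefficient_of_variation(ages), 2))   # 64.01 (15.36 / 24, as a percentage)
```

Because the result is unit-free, it can be used to compare the spread of data sets measured in different units or on different scales.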

    Combined Values

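    Combined mean of 2 groups:

    $\mu = \frac{\mu_1 N_1 + \mu_2 N_2}{N_1 + N_2}$

    Combined variance:

    $\sigma^2 = \frac{N_1 \sigma_1^2 + N_2 \sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}$

    where $d_1 = \mu_1 - \mu$ and $d_2 = \mu_2 - \mu$.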
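
One way to sanity-check the combined formulas is to split a data set into two groups, combine their summaries, and compare with the statistics of the pooled data. The sketch below does this for the family ages from the variance example; the split into [2, 18] and [34, 42] and the function name are arbitrary choices made for illustration.

```python
import statistics


def combined_mean_variance(n1, mean1, var1, n2, mean2, var2):
    """Combine the sizes, means and population variances of two groups."""
    n = n1 + n2
    mean = (mean1 * n1 + mean2 * n2) / n
    d1 = mean1 - mean                   # offset of group 1's mean
    d2 = mean2 - mean                   # offset of group 2's mean
    variance = (n1 * var1 + n2 * var2 + n1 * d1 ** 2 + n2 * d2 ** 2) / n
    return mean, variance


group1 = [2, 18]
group2 = [34, 42]

mean, variance = combined_mean_variance(
    len(group1), statistics.fmean(group1), statistics.pvariance(group1),
    len(group2), statistics.fmean(group2), statistics.pvariance(group2),
)
print(mean, variance)                   # 24.0 236.0

# The pooled data give the same mean and variance.
pooled = group1 + group2
print(statistics.fmean(pooled), statistics.pvariance(pooled))
```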

    Chebyshev's Rule

    Irrespective of the shape of the distribution curve, at least 75% of the values will fall within $\pm 2\sigma$ of the mean, and at least about 89% within $\pm 3\sigma$.

    In general, the percentage of data within $\pm k$ standard deviations of the mean will be at least $(1 - 1/k^2) \times 100$, for $k > 1$.
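
The bound is easy to tabulate and to compare with actual data. The following sketch uses the 20 sales figures from the earlier slides; the function names are hypothetical, and the population standard deviation is assumed.

```python
import statistics

# The 20 sales figures from the department store example.
SALES = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]


def chebyshev_bound(k):
    """Minimum percentage of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return (1 - 1 / k ** 2) * 100


def percent_within(values, k):
    """Observed percentage of values within k standard deviations of the mean."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    inside = sum(1 for x in values if abs(x - mu) <= k * sigma)
    return 100 * inside / len(values)


for k in (2, 3):
    # The observed percentage should never fall below the Chebyshev bound.
    print(k, round(chebyshev_bound(k), 1), percent_within(SALES, k))
```

The rule is a guaranteed minimum, so real data sets usually show a noticeably higher percentage inside each band.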