Proba and Stat Lesson - Part 1


  • 8/12/2019 Proba and Stat Lesson - Part 1


    Statistics is regarded as the branch of mathematics that is involved in:

    1. The design of experiments and other aspects of the science of data collection,

    2. Manipulating and summarizing data in ways that support wise decision making, and

    3. Drawing scientific conclusions from the results of data interpretation, which aids forecasting and prediction.


    Statistics is also described as an art that involves the systematic collection, presentation, analysis, and interpretation of data, which yields meaningful facts and thereby plays an important role in decision making.

    Branches of Statistics

    1. Descriptive Statistics

    2. Inferential Statistics


    Branches of Statistics

    Descriptive Statistics: (1) Deals with the collection and presentation of data. This is the first part of any statistical analysis. (2) Is a way of summarizing data by letting one number stand for a group of numbers. Tables and graphs can also be used to summarize data. (3) Covers the collection of data, its presentation in various forms such as tables, graphs, and diagrams, and the finding of averages and other measures that describe the data.


    Branches of Statistics

    Examples of descriptive statistics: industrial statistics, business statistics, housing and population statistics, stocks, and trade statistics.


    Branches of Statistics

    Inferential Statistics: (1) Deals with the techniques used to analyze data, make estimates, and draw conclusions from limited information taken on a sample basis, and to test the reliability of those estimates. (2) Involves drawing the right conclusions from the statistical analysis that has been performed using descriptive statistics.


    Branches of Statistics

    Example: Suppose we want to investigate the effectiveness of a certain medicine in curing an illness. We take a sample from the population, conduct an experiment on two groups of respondents, and compare the results. This provides inferences about the population proportion. Such a study belongs to inferential statistics.


    Measure of Central Tendency

    A measure of central tendency (1) is a statistic that describes a set of data by identifying the central position within that set of data. As such, measures of central tendency are sometimes called measures of central location. (2) It provides an index for describing a group or the difference between groups.

    Mean: not applicable to nominal data.

    Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$.
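
The mean can be sketched in a few lines of Python. This is a minimal illustration, not the lesson's own method: the function name `arithmetic_mean` is hypothetical, and the five scores are borrowed from the lesson's median table purely as sample data.

```python
from statistics import mean


def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Sample data: the five scores used in the lesson's median table.
scores = [98, 90, 86, 84, 81]

print(arithmetic_mean(scores))  # hand-written version
print(mean(scores))             # standard-library version, for comparison
```

The same computation applies to a sample or a population; only the symbol changes ($\bar{x}$ with $n$ values, $\mu$ with $N$ values).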


    Median: the middlemost value in a distribution of data. It is the point in a distribution of measures below which 50 percent of the cases lie and above which the other 50 percent lie.

    Odd number of observations:

    $\text{Mdn} = x_{\frac{n+1}{2}}$

    i-th    x
    1       98
    2       90
    3       86   (Mdn)
    4       84
    5       81

    $\text{Mdn} = x_{\frac{5+1}{2}} = x_{3\text{rd}} = 86$


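    Median, even number of observations:

    $\text{Mdn} = \frac{x_{\frac{n}{2}} + x_{\frac{n+2}{2}}}{2}$

    i-th    x
    1       98
    2       90
    3       86   (Mdn)
    4       84   (Mdn)
    5       81
    6       77

    $\frac{n}{2} = \frac{6}{2} = 3$ (3rd value) and $\frac{n+2}{2} = \frac{6+2}{2} = 4$ (4th value)

    $\text{Mdn} = \frac{x_{3\text{rd}} + x_{4\text{th}}}{2} = \frac{86 + 84}{2} = 85$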
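
The two positional rules for the median can be checked with a short Python sketch. The function name `median_by_position` is hypothetical; the data are the two score lists from the tables above. Positions in the lesson are 1-based, so the code subtracts 1 when indexing.

```python
def median_by_position(values):
    """Median via the positional rules: the (n+1)/2-th value for odd n,
    the average of the n/2-th and (n+2)/2-th values for even n."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values, reverse=True)  # ranked high to low, as in the tables
    n = len(ordered)
    if n % 2 == 1:
        return ordered[(n + 1) // 2 - 1]
    lower_position = n // 2          # e.g. 6 / 2 = 3rd value
    upper_position = (n + 2) // 2    # e.g. (6 + 2) / 2 = 4th value
    return (ordered[lower_position - 1] + ordered[upper_position - 1]) / 2


print(median_by_position([98, 90, 86, 84, 81]))      # 86
print(median_by_position([98, 90, 86, 84, 81, 77]))  # 85.0
```

Sorting in ascending order would give the same result, because the middle positions are symmetric.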


    Measure of Central Tendency

    Mode: the value that occurs most frequently. It is possible to have more than one mode; if there are two modes, the data are said to be bimodal. It is also possible for a set of data to have no mode. This situation occurs if the number of modes gets to be "too large". It is not really possible to define "too large", but one should exercise good judgment. A reasonable, though very generous, rule of thumb is that if the data points accounted for in the list of modes make up half or more of all the data points, then there is no mode.
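
As a rough sketch of the idea, the Python below lists the most frequent values and reports what share of the data points they account for, which is the quantity the rule of thumb refers to. The function name `find_modes` and both data sets are hypothetical, and the code deliberately leaves the final "is there a mode?" judgment to the reader.

```python
from collections import Counter


def find_modes(values):
    """Return (modes, share): the most frequent values and the fraction
    of all data points that those values account for."""
    if not values:
        raise ValueError("an empty data set has no mode")
    counts = Counter(values)
    highest = max(counts.values())
    modes = sorted(v for v, c in counts.items() if c == highest)
    share = highest * len(modes) / len(values)
    return modes, share


# Hypothetical data: 2 and 5 each occur twice, so the data are bimodal.
print(find_modes([1, 2, 2, 3, 4, 5, 5, 6, 7, 8]))  # ([2, 5], 0.4)

# Hypothetical data: every value ties, so the "modes" cover all the data.
print(find_modes([4, 8, 15, 16]))  # ([4, 8, 15, 16], 1.0)
```

In the second case the list of modes accounts for every data point, which is the kind of situation in which the rule of thumb says the data have no mode.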


    Measure of Central Tendency

    When to use the Measures of Central Tendency

    [Table: mean, median, and mode against interval/ratio data and ordinal/nominal data]


    Skewness

    We often test whether our data are normally distributed because this is a common assumption underlying many statistical tests. Some distributions of data, such as the bell curve, are symmetric. This means that the right and the left halves are perfect mirror images of one another. But not every distribution of data is symmetric. Sets of data that are not symmetric are said to be asymmetric. Skewness is the measure of how asymmetric a distribution is. As we will see, data can be skewed either to the right or to the left.
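
The lesson does not give a formula for skewness. One common choice, assumed here for illustration only, is the moment coefficient $g_1 = m_3 / m_2^{3/2}$, where $m_k$ is the average of $(x_i - \bar{x})^k$. The function name `moment_skewness` and the data sets below are hypothetical.

```python
def moment_skewness(values):
    """Moment coefficient of skewness, g1 = m3 / m2**1.5 (an assumed
    definition; the lesson itself does not state a formula)."""
    n = len(values)
    if n == 0:
        raise ValueError("skewness of an empty data set is undefined")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


print(moment_skewness([1, 2, 3, 4, 5]))       # symmetric data: zero
print(moment_skewness([1, 2, 3, 4, 20]))      # long right tail: positive
print(moment_skewness([-20, -4, -3, -2, -1])) # long left tail: negative
```

A positive value corresponds to data skewed to the right and a negative value to data skewed to the left.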


    Data that are skewed to the right have a long tail that extends to the right. An alternate way of describing a data set skewed to the right is to say that it is positively skewed. In this situation the mean and the median are both greater than the mode. As a general rule, most of the time for data skewed to the right, the mean will be greater than the median. In summary, for a data set skewed to the right, the mode is less than both the mean and the median, and most of the time mode < median < mean.


    In fact, in any symmetrical, single-peaked distribution the mean, median, and mode are equal. However, in this situation, the mean is widely preferred as the best measure of central tendency because it is the measure that includes all the values in the data set in its calculation, and any change in any of the scores will affect the value of the mean. This is not the case with the median or mode.


    When the data are skewed, we find that the mean is dragged in the direction of the skew. In these situations, the median is generally considered to be the best representative of the central location of the data. The more skewed the distribution, the greater the difference between the median and mean, and the greater the emphasis that should be placed on using the median as opposed to the mean. A classic example of a right-skewed distribution is income (salary), where higher earners give a false representation of the typical income if it is expressed as a mean and not a median.
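
The income example can be illustrated with a small Python sketch. The salary figures are invented for illustration; they are not taken from the lesson or from any real data set.

```python
from statistics import mean, median

# Hypothetical annual salaries: six typical earners and one very high earner.
salaries = [28_000, 30_000, 32_000, 35_000, 38_000, 40_000, 250_000]

print(mean(salaries))    # pulled upward by the single high earner
print(median(salaries))  # 35000, the middle salary

# The mean exceeds the median, as expected for right-skewed data.
print(mean(salaries) > median(salaries))  # True
```

Here the mean is larger than six of the seven salaries, so it describes almost no one in the group, while the median sits at a salary someone actually earns.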


    If the data are expected to follow a normal distribution but tests of normality show that they are non-normal, it is customary to use the median instead of the mean. However, this is more a rule of thumb than a strict guideline. Sometimes, researchers wish to report the mean of a skewed distribution if the median and mean are not appreciably different (a subjective assessment), and if doing so allows easier comparison with previous research.


    Skewed to the right

    Always: mode < mean

    Always: mode < median

    Most of the time: mode < median < mean


    Skewed to the left

    Always: mean < mode

    Always: median < mode

    Most of the time: mean < median < mode


    1. Rule One. If the mean is less than the median, the data are skewed to the left.

    2. Rule Two. If the mean is greater than the median, the data are skewed to the right.
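
The two rules translate directly into a comparison of the mean and the median. The sketch below is only an illustration: the function name `skew_direction`, the label returned when the two measures are equal, and the sample data are all assumptions.

```python
from statistics import mean, median


def skew_direction(values):
    """Apply Rule One and Rule Two by comparing the mean with the median."""
    m, mdn = mean(values), median(values)
    if m < mdn:
        return "skewed to the left"   # Rule One
    if m > mdn:
        return "skewed to the right"  # Rule Two
    return "no skew indicated"        # assumed label for mean == median


print(skew_direction([1, 17, 18, 19, 20]))  # mean 15 < median 18
print(skew_direction([1, 2, 3, 4, 20]))     # mean 6 > median 3
print(skew_direction([1, 2, 3]))            # mean 2 == median 2
```

Because the rules compare only two summary numbers, they indicate a direction of skew but say nothing about how strong it is.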


    Kurtosis

    Kurtosis is the measure of the peak of a distribution, and indicates how high the distribution is around the mean. The kurtosis of a distribution falls into one of three categories:

    1. Mesokurtic

    2. Leptokurtic

    3. Platykurtic


    Kurtosis

    1. Mesokurtic: Kurtosis is typically measured with respect to the normal distribution. A distribution that is peaked in the same way as any normal distribution, not just the standard normal distribution, is said to be mesokurtic. The peak of a mesokurtic distribution is neither high nor low; rather, it is considered to be a baseline for the two other classifications.

    2. Leptokurtic: A leptokurtic distribution is one that has kurtosis greater than a mesokurtic distribution. Leptokurtic distributions are identified by peaks that are thin and tall. The tails of these distributions, to both the right and the left, are thick and heavy. Leptokurtic distributions are named for the prefix "lepto", meaning "skinny".


    Kurtosis

    3. Platykurtic: Platykurtic distributions are those that have a peak lower than a mesokurtic distribution. They are characterized by a certain flatness to the peak and have slender tails. The name of these distributions comes from the prefix "platy", meaning "broad".
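
The three categories can be illustrated numerically. The lesson gives no formula, so the sketch below assumes the usual excess kurtosis, $m_4 / m_2^2 - 3$, which is 0 for a normal distribution. The function names, the `tolerance` cut-off, and the data sets are hypothetical choices made for this example.

```python
def excess_kurtosis(values):
    """Excess kurtosis m4 / m2**2 - 3 (an assumed definition; the value
    is 0 for a normal distribution, the mesokurtic baseline)."""
    n = len(values)
    if n == 0:
        raise ValueError("kurtosis of an empty data set is undefined")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return m4 / m2 ** 2 - 3


def classify_kurtosis(values, tolerance=0.1):
    """Label the data; `tolerance` is an arbitrary band around zero."""
    k = excess_kurtosis(values)
    if k > tolerance:
        return "leptokurtic"
    if k < -tolerance:
        return "platykurtic"
    return "mesokurtic"


print(classify_kurtosis([0, 0, 0, 0, 0, 0, 0, 0, -10, 10]))  # leptokurtic
print(classify_kurtosis([1, 2, 3, 4, 5]))                    # platykurtic
print(classify_kurtosis([-1, 0, 0, 0, 0, 1]))                # mesokurtic
```

The first data set has a tall central spike with two far-out values, the second is flat, and the third happens to have the same fourth-moment ratio as a normal distribution.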


    Skewed to the left

    The situation reverses itself when we deal with data skewed to the left. Data that are skewed to the left have a long tail that extends to the left. An alternate way of describing a data set skewed to the left is to say that it is negatively skewed. In this situation the mean and the median are both less than the mode. As a general rule, most of the time for data skewed to the left, the mean will be less than the median. In summary, for a data set skewed to the left, the mode is greater than both the mean and the median, and most of the time mean < median < mode.