Session 1.pdf

  • 7/24/2019 Session 1.pdf

    1/50

    Advanced Business Statistics

    Introduction and Descriptive Statistics

    Session 1

    It is better to be roughly right than

    precisely wrong.

    John Maynard Keynes

    Information resulting from a good statistical analysis is always concise, often precise, and never useless!

    Statistics teaches us how to summarize data, analyze
    them, and draw meaningful inferences that lead to
    improved decisions. These better decisions help us
    improve the running of a department, a company,
    or the entire economy.

    Scope of Statistics

    HR Problems

    Retention rate

    Staffing

    % of Increment/year/ six months

    Factors influencing productivity

    Operations Problems

    System study

    Utilization of staff

    Minimizing idle time of machines

    Quality/Production Problems

    Processing time

    Volume of business per day

    Meeting clients' requirements: volume, precision, time

    Error report analysis: QA & QC

    Inspection plans

    Why Statistics?

    A quantitative variable can be described by a number for

    which arithmetic operations such as averaging make

    sense.

    A qualitative (or categorical) variable simply records a

    quality. If a number is used for distinguishing members of

    different categories of a qualitative variable, the number

    assignment is arbitrary.
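
    The contrast between the two kinds of variable can be sketched in a few lines of Python. The ages, departments, and numeric codings below are hypothetical and chosen only for illustration: averaging the ages is meaningful, while the "average" of the category codes changes whenever the arbitrary coding changes.

```python
from statistics import mean

# Hypothetical data: ages are quantitative, departments are qualitative.
ages = [34, 41, 29, 52]
departments = ["HR", "Operations", "HR", "Quality"]

# Averaging a quantitative variable makes sense.
average_age = mean(ages)

# Two equally valid (arbitrary) numeric codings of the same categories.
coding_a = {"HR": 1, "Operations": 2, "Quality": 3}
coding_b = {"HR": 3, "Operations": 1, "Quality": 2}

mean_a = mean(coding_a[d] for d in departments)
mean_b = mean(coding_b[d] for d in departments)

print(average_age)     # average age of the four people
print(mean_a, mean_b)  # the two values differ only because the codes are arbitrary
```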

    Quantitative or Qualitative?

    Types of Scales

    Nominal Ordinal

    Interval Ratio

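    In general, the interval (difference) between two interval scale
    measurements will be in ratio scale. For example, 20 degrees Celsius
    is not "twice as hot" as 10 degrees, but a 20-degree difference is
    twice a 10-degree difference.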

    Flow Chart for Data Types

    Continuum of data types, from discrete to continuous:

    Class       Type                Description                             Example
    Discrete    Binary              Classified into one of two categories   Voting for / against a move
    Discrete    Ordered categories  Rankings or ratings                     Training feedback on a 5-point scale
    Discrete    Count               Counted discretely                      Number of errors in an instruction
    Continuous  Continuous          Measured on a continuous scale          Time (in hours) to process an instruction

    The population consists of the set of all measurements in

    which the investigator is interested. The population is also

    called the universe. A sample is a subset of

    measurements selected from the population.
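
    A minimal sketch of the relationship between a population and a sample, assuming simple random sampling without replacement. The slide does not specify a sampling method, and the household bills below are simulated, hypothetical values.

```python
import random

# Hypothetical population: monthly electric bills for N = 1000 households.
random.seed(42)  # fixed seed so the sketch is reproducible
population = [round(random.uniform(50, 250), 2) for _ in range(1000)]

# A simple random sample of n = 30 measurements, drawn without replacement.
sample = random.sample(population, 30)

N, n = len(population), len(sample)
print(N, n)  # population size and sample size
```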

    Population vs. Sample?

    A set of measurements obtained on some variable is

    called a data set.

    Class Exercise

    A survey by an electric company contains questions on the following:

    1. Age of household head.

    2. Sex of household head.

    3. Number of people in household.

    4. Use of electric heating (yes or no).

    5. Number of large appliances used daily.

    6. Thermostat setting in winter.

    7. Average number of hours heating is on.

    8. Average number of heating days.

    9. Household income.

    10. Average monthly electric bill.

    11. Ranking of this electric company as compared with two previous electricity suppliers.

    Describe the variables implicit in these 11 items as quantitative or

    qualitative, and describe the scales of measurement.

    Some Symbols

    Population size = $N$
    Population mean = $\mu$
    Population standard deviation = $\sigma$

    Sample size = $n$
    Sample mean = $\bar{x}$
    Sample standard deviation = $s$

    The Pth percentile of a group of numbers is that value
    below which lie P% (P percent) of the numbers in the
    group. With the data sorted in ascending order, the
    position of the Pth percentile is given by
    (n + 1)P/100, where n is the number of data points.
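
    The position rule can be sketched in Python as follows. The function name `percentile` and the sample values are hypothetical, and the use of linear interpolation when the position is not a whole number is an assumption, since the slide gives only the position formula.

```python
def percentile(data, p):
    """Return the p-th percentile using the (n + 1)P/100 position rule."""
    ordered = sorted(data)
    n = len(ordered)
    position = (n + 1) * p / 100   # 1-based position in the ordered data
    lower = int(position)          # whole part of the position
    fraction = position - lower    # fractional part, used to interpolate
    if lower < 1:                  # position falls before the first value
        return ordered[0]
    if lower >= n:                 # position falls at or after the last value
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)


# Hypothetical data set with n = 7 observations.
values = [12, 5, 9, 20, 15, 7, 11]
q1 = percentile(values, 25)      # position (7 + 1) * 25 / 100 = 2
median = percentile(values, 50)  # position (7 + 1) * 50 / 100 = 4
q3 = percentile(values, 75)      # position (7 + 1) * 75 / 100 = 6
print(q1, median, q3)
```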

    First quartile (25th percentile) >> Lower quartile

    Median (50th percentile) >> Middle quartile

    Third quartile (75th percentile) >> Upper quartile

    We define the interquartile range as the difference between the first and third quartiles.

    Class Exercise

    The following data are numbers of passengers on flights of Delta

    Air Lines between San Francisco and Seattle over 33 days in

    April and early May.

    128, 121, 134, 136, 136, 118, 123, 109, 120, 116, 125, 128, 121,

    129, 130, 131, 127, 119, 114, 134, 110, 136, 134, 125, 128, 123,

    128, 133, 132, 136, 134, 129, 132

    Find the lower, middle, and upper quartiles of this data set.

    Also find the 10th, 15th, and 65th percentiles. What is the interquartile range?
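
    One way to check an answer to this exercise is with Python's standard library. The sketch assumes that `statistics.quantiles` with `method="exclusive"` matches the (n + 1)P/100 position rule with linear interpolation; other software may use a different convention and give slightly different values.

```python
from statistics import quantiles

passengers = [
    128, 121, 134, 136, 136, 118, 123, 109, 120, 116, 125, 128, 121,
    129, 130, 131, 127, 119, 114, 134, 110, 136, 134, 125, 128, 123,
    128, 133, 132, 136, 134, 129, 132,
]

# Quartiles: the "exclusive" method places cut points at positions i(n + 1)/4.
q1, q2, q3 = quantiles(passengers, n=4, method="exclusive")
iqr = q3 - q1

# With n=100 the list holds the 1st..99th percentiles, so index P - 1 is the Pth.
pct = quantiles(passengers, n=100, method="exclusive")
p10, p15, p65 = pct[9], pct[14], pct[64]

print(q1, q2, q3, iqr)
print(p10, p15, p65)
```

    Under this convention the quartiles come out as 121, 128, and 133.5, so the interquartile range is 12.5.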

    Measures of Central Tendency

    The mean of a set of observations is their average.

    The median is the middle value of the ordered observations
    (the 50th percentile).

    The mode of the data set is the value that occurs
    most frequently.
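
    The three measures can be computed with Python's `statistics` module. The observations below are hypothetical. `multimode` is used rather than `mode` so that ties, if any, are all reported.

```python
from statistics import mean, median, multimode

# Hypothetical data set (not from the slides).
observations = [3, 7, 7, 2, 9, 7, 4]

avg = mean(observations)         # arithmetic average
mid = median(observations)       # middle value of the sorted data
modes = multimode(observations)  # list of the most frequent value(s)

print(avg, mid, modes)
```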

    A Symmetrically Distributed Data Set

    Measures of Variability

    The range of a set of observations is the difference
    between the largest observation and the smallest
    observation.

    The interquartile range, the difference between the first
    and third quartiles, is also a measure of variability.

    The variance of a set of observations is the

    average squared deviation of the data points from

    their mean.

    The standard deviation of a set of observations is the (positive) square root of the variance of the set.
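
    A short sketch of both definitions using Python's `statistics` module, with hypothetical observations. It assumes the usual convention that a population variance divides by N while a sample variance divides by n - 1, so the two results differ slightly.

```python
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical observations (not from the slides).
data = [4, 8, 6, 5, 7]

# Treating the data as a whole population: divide by N.
pop_var = pvariance(data)
pop_sd = pstdev(data)

# Treating the data as a sample: divide by n - 1.
sample_var = variance(data)
sample_sd = stdev(data)

print(pop_var, pop_sd)
print(sample_var, sample_sd)
```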

    Shortcut Formula for the Sample Variance
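
    A commonly used shortcut form, written here on the assumption that the sample variance uses the n - 1 divisor, avoids computing each deviation from the mean. It is algebraically equivalent to summing the squared deviations from the sample mean.

```latex
s^2 = \frac{\displaystyle\sum_{i=1}^{n} x_i^2 \;-\; \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^{2}}{n - 1}
```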

    In financial analysis, the standard deviation is often

    used as a measure of volatility and of the risk

    associated with financial variables.

    Skewness and Kurtosis

    Relative kurtosis = Absolute kurtosis - 3
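
    The relation above can be illustrated with a small Python sketch. The moment-based formulas for skewness and absolute kurtosis used here are an assumption, since the slides do not state which definitions they use, and the function name `moment_shape` is hypothetical.

```python
def moment_shape(data):
    """Return (skewness, absolute kurtosis, relative kurtosis) from central moments."""
    n = len(data)
    mean = sum(data) / n
    # Central moments of order 2, 3, and 4, each averaged over n.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    absolute_kurtosis = m4 / m2 ** 2
    relative_kurtosis = absolute_kurtosis - 3
    return skewness, absolute_kurtosis, relative_kurtosis


# Hypothetical, perfectly symmetric data: skewness should be zero.
symmetric = [1, 2, 3, 4, 5]
skew, abs_kurt, rel_kurt = moment_shape(symmetric)
print(skew, abs_kurt, rel_kurt)
```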

    Chebyshev's Theorem

    A mathematical theorem called Chebyshev's theorem

    establishes the following rules:

    1. At least three-quarters of the observations in a set

    will lie within 2 standard deviations of the mean.

    2. At least eight-ninths of the observations in a set will

    lie within 3 standard deviations of the mean.

    In general, the rule states that at least $1 - 1/k^2$ of the
    observations will lie within k standard deviations of
    the mean, for any k > 1.
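
    The bound can be checked numerically. The sketch below uses hypothetical data and hypothetical function names, and it measures spread with the population standard deviation (divide by N). The observed fraction should never fall below the bound, although it is often well above it.

```python
from statistics import mean, pstdev


def chebyshev_bound(k):
    """Minimum fraction of observations within k standard deviations (k > 1)."""
    return 1 - 1 / k ** 2


def fraction_within(data, k):
    """Observed fraction of data within k standard deviations of the mean."""
    mu, sigma = mean(data), pstdev(data)
    inside = [x for x in data if abs(x - mu) <= k * sigma]
    return len(inside) / len(data)


# Hypothetical data with one extreme value.
data = [2, 4, 4, 4, 5, 5, 7, 9, 30]
for k in (2, 3):
    print(k, chebyshev_bound(k), fraction_within(data, k))
```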

    Methods of Displaying Data

    Pie Chart

    [Figure: pie chart titled "Semester 1, 2015/2016" with 15 slices
    labeled 23%, 14%, 14%, 11%, 9%, 8%, 4%, 5%, 3%, 3%, 3%, 3%, 0%, 0%, 0%.
    Legend: OB, Operation and HRM; Innovation and Technology Management;
    Marketing; Consumer and Customer-related Studies; Accounting;
    Supply Chain Management; Service Quality and Customer Satisfaction;
    Tourism; Corporate Governance and Ethics; Finance and Economics;
    Social Responsibility and Sustainability; Strategy;
    Entrepreneurship / Social Entrepreneurship; International Business;
    Leadership / Ethical Leadership]

    Bar Chart and Ogive

    [Figure: bar chart titled "Semester 1, 2015/2016" showing counts
    ("No.", axis 0 to 16) with a cumulative-percentage line
    ("Cumulative %", axis 0% to 100%) passing through 23%, 37%, 51%,
    62%, 71%, 78%, 83%, 88%, 91%, 94%, 97%, 100%]

    An ogive is a cumulative-frequency (or cumulative relative-frequency) graph. An ogive starts

    at 0 and goes to 1.00 (for a relative-frequency ogive) or to the maximum cumulative

    frequency.
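
    The values plotted on an ogive are running totals of the frequencies. A minimal sketch with hypothetical category counts:

```python
from itertools import accumulate

# Hypothetical category counts, sorted from most to least frequent.
counts = [5, 3, 2]

cumulative = list(accumulate(counts))  # running totals of the counts
total = cumulative[-1]                 # maximum cumulative frequency
cumulative_relative = [c / total for c in cumulative]

print(cumulative)
print(cumulative_relative)  # the last value is 1.0 for a relative-frequency ogive
```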

    Stem and Leaf Box Plot

    Outlier

    An outlier is an observation that lies far from the
    other observations.
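
    The slides do not say how far is "far". One common convention, assumed here and often used with box plots, flags any point more than 1.5 interquartile ranges below the first quartile or above the third quartile. The function name and data are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(data, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(data, n=4, method="exclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]


# Hypothetical data with one value far from the rest.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(iqr_outliers(values))
```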

    Don't misuse statistics!

    Class Exercise

    Heart Rate

    Thank you

    Dr. Mehran Nejati

    [email protected]