Chapter 1 Introduction to Statistics.pdf


  • 7/26/2019 Chapter 1 Introduction to Statistics.pdf


    CHAPTER 1: INTRODUCTION TO STATISTICS

    - Qualitative data and quantitative data

    - Measures of central tendency and dispersion

    BEKA 2453 Statistics & Numerical Methods


    WHAT IS STATISTICS?

    Your company has created a new drug that may cure arthritis. How would you conduct a test to confirm the drug's effectiveness?

    You want to conduct a poll on whether your school should use its funding to build a new athletic complex or a new library. How many people do you have to poll? How do you ensure that your poll is free of bias? How do you interpret your results?

    Statistics is the mathematical science involved in the application of quantitative principles to the collection, analysis, and presentation of numerical data. The practice of statistics uses data from some population in order to describe it meaningfully, to draw conclusions from it, and to make informed decisions.

    BENG 2142 Statistics


    WHAT DO ENGINEERS DO?

    An engineer is someone who solves problems of interest to society with the efficient application of scientific principles by:

    Refining existing products

    Designing new products or processes


    THE CREATIVE PROCESS

    THE ENGINEERING PROCESS


    STATISTICS SUPPORTS THE CREATIVE PROCESS


    1.1 Qualitative data and Quantitative data

    1.1.1 Introduction

    Statistics - the science of collecting, organizing, summarizing and analyzing information in order to draw conclusions.

    Two types of statistics

    Descriptive statistics

    Inferential Statistics


    Descriptive statistics

    consists of organizing and summarizing the information collected. Descriptive statistics describes the information collected through numerical measurements, charts, graphs and tables.

    Inferential statistics

    generalizes results obtained from a sample to the population and measures their reliability.


    1.1.2 Basic Terms

    Population - consists of all items or elements of interest for a particular decision or investigation.

    Example: all FKE students in UTeM.

    Sample - a certain number of elements that have been chosen from a population. A sample is a subset of the population.

    Example: a list of the students of 2BEKG would be a sample from the population of all FKE students in UTeM.


    POPULATION VS. SAMPLE


    1.1.2 Basic Terms (cont.)

    Random sample is a sample drawn in such a way that each element of the population has a chance of being selected.

    Simple random sample implies that any particular sample of a specified sample size has the same chance of being selected as any other sample of that size.

    Element / member is a specific subject or individual about which the information is collected.
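The definition of a simple random sample can be illustrated with a short Python sketch. The population of 40 student ID numbers and the helper name `simple_random_sample` are hypothetical, chosen only for illustration; `random.sample` from the standard library draws without replacement so that every subset of the requested size is equally likely.

```python
import random

# Hypothetical population: ID numbers for 40 students (illustrative only).
population = list(range(1, 41))

def simple_random_sample(items, k, seed=None):
    """Draw k distinct elements so that every size-k subset is equally likely."""
    rng = random.Random(seed)  # the seed is only for reproducible examples
    return rng.sample(items, k)

sample = simple_random_sample(population, k=5, seed=1)
print(sample)
```

Each element in `sample` is a member of the population, and the sample is a subset of it.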


    1.1.2 Basic Terms (cont.)

    Variable is a characteristic of the individual within the sample or population.

    Observation / measurement is the value of a variable for an element.

    Data set is a collection of values of one or more variables.

    Grouped data set is a collection of data which are grouped in classes.

    Population parameter is a descriptive measure computed from population data.

    Sample statistic is a descriptive measure computed from sample data.

    Outliers / extreme values are values that are very small or very large relative to the majority of the values in a data set.


    Outlier


    PRACTICE PROBLEM

    A random sample of 30 middle school students averages 1.8 hours spent on homework each night. It is believed that middle school students spend 2 hours each night on homework. Identify the sample, the population, the sample statistic, and the population parameter.

    The sample is the 30 middle school students who are randomly selected. The population is all middle school students.

    The sample statistic is $\bar{x} = 1.8$ hours, and the population parameter is $\mu = 2$ hours.

    Remember that sample statistics are values that represent a sample, while population parameters are values that represent a population.


    1.1.3 Variables

    Qualitative variables allow for classification of individuals based on some attribute or characteristic.

    Example: the gender of newborn babies; the marital status of people; types of cars.

    Quantitative variables provide numerical measures of individuals (values that can be counted or measured).

    Example: the weight of children; the number of cars owned.


    1.1.3 Variables (cont.)

    Quantitative variables can be further classified

    into two groups:

    (a) Discrete variables

    have a finite or countable number of possible values.

    Example:

    The number of heads obtained by flipping a coin five times.

    The number of cars that arrive at KFC's drive-through between 1.00 p.m. and 2.00 p.m.


    1.1.3 Variables (cont.)

    (b) Continuous variables

    have an infinite number of possible values that are not countable. They are obtained by measuring and include fractions and decimals.

    Example:

    Time spent studying for your first statistics exam.

    The height of volleyball players.


    PRACTICE PROBLEM

    Determine whether the following variables are qualitative or quantitative.

    1. Postal code

    2. Salary

    3. PTPTN allowances

    4. Gender

    5. Marital status

    Determine whether the following variables are discrete or continuous.

    1. Heights of 2BEKG students in FKE.

    2. Number of books borrowed by FKE students each day from the library.

    3. Number of 2BEKG students who attend Statistics class every Tuesday.

    4. The time taken for 2BEKG students to get to class at 8 o'clock in the morning.


    1.1.4 Graphical Methods

    Qualitative data can be displayed by using:

    Bar graph

    Pie chart

    Example:


    Solution:

    [Figure: bar graph of Frequency (axis from 0 to 300) against Rating (A, B, C, D), and pie chart of the ratings A, B, C, D with slices labelled 9%, 65%, 23% and 3%]


    1.1.4 Graphical Methods

    Grouped data can be graphed using:

    Histogram

    Polygon

    A cumulative frequency distribution can be graphed using:

    Ogive (cumulative histogram)


    Example:


    Solution:

    (a) & (b)


    Solution:

    (c) Histogram Polygon


    Solution:

    (d)


    PRACTICE PROBLEM

    The following scores represent the final examination grades for the Statistics subject:

    23 60 79 32 57 74 52 70 82 36 80 77 81 95 41 65 92 85 55 76
    52 10 64 75 78 25 80 98 81 67 41 71 83 54 64 72 88 62 74 43
    60 78 89 76 84 48 84 90 15 79 34 67 17 82 69 74 63 80 85 61

    a) Construct a frequency distribution table with a class width of 10.

    b) Determine the class boundaries and class midpoints.

    c) Calculate the relative frequencies and percentages for all classes.

    d) Construct a frequency histogram for the data.

    e) Prepare the cumulative frequency distribution table.

    f) Construct an ogive for the cumulative frequency.
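Parts (a), (b), (c) and (e) can be sketched in Python as follows. The class limits 1-10, 11-20, ..., 91-100 are an assumption consistent with a class width of 10, and the variable names are illustrative. Boundaries are taken 0.5 below and above the class limits, and the midpoint is the average of the two limits.

```python
# Final examination scores from the practice problem.
scores = [
    23, 60, 79, 32, 57, 74, 52, 70, 82, 36, 80, 77, 81, 95, 41, 65, 92, 85, 55, 76,
    52, 10, 64, 75, 78, 25, 80, 98, 81, 67, 41, 71, 83, 54, 64, 72, 88, 62, 74, 43,
    60, 78, 89, 76, 84, 48, 84, 90, 15, 79, 34, 67, 17, 82, 69, 74, 63, 80, 85, 61,
]

WIDTH = 10  # class width given in part (a)
# Assumed class limits: 1-10, 11-20, ..., 91-100.
classes = [(lower, lower + WIDTH - 1) for lower in range(1, 101, WIDTH)]

n = len(scores)
cumulative = 0
rows = []
for lower, upper in classes:
    freq = sum(lower <= s <= upper for s in scores)  # tally scores in this class
    cumulative += freq
    rows.append({
        "class": f"{lower}-{upper}",
        "boundaries": (lower - 0.5, upper + 0.5),
        "midpoint": (lower + upper) / 2,
        "frequency": freq,
        "relative": freq / n,
        "cumulative": cumulative,
    })

for r in rows:
    print(f"{r['class']:>7} {r['frequency']:>3} {r['relative']:.3f} {r['cumulative']:>3}")
```

The cumulative column is what an ogive plots against the upper class boundaries.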


    Marks    No. of students  Relative frequency  Percentage  Class boundaries  Class midpoint
    1-10     1                0.017               2           0.5-10.5          5.5
    11-20    2                0.033               3           10.5-20.5         15.5
    21-30    2                0.033               3           20.5-30.5         25.5
    31-40    3                0.050               5           30.5-40.5         35.5
    41-50    4                0.067               7           40.5-50.5         45.5
    51-60    7                0.117               12          50.5-60.5         55.5
    61-70    10               0.167               17          60.5-70.5         65.5
    71-80    16               0.267               27          70.5-80.5         75.5
    81-90    12               0.200               20          80.5-90.5         85.5
    91-100   3                0.050               5           90.5-100.5        95.5
    Total    60               1.000               100

    [Figure: bar graph of No. of students (axis from 0 to 18) against Marks for the classes 1-10, 11-20, ..., 91-100]

    BAR GRAPH. HOW ABOUT A HISTOGRAM (NO GAPS)?


    1.2 Measures of central tendency and dispersion

    Measures of central tendency

    - Indicate the central point around which observations tend to cluster

    - Mean, mode, median

    Measures of dispersion

    - Measures that help us know about the spread of a data set

    - Range, variance, standard deviation


    1.2.1 Numerical Measures

    Measures of Central Tendency    Measures of Dispersion
    Mean                            Range
    Median                          Variance
    Mode                            Standard deviation

    Skewness


    1.2.2 Measures of Central Tendency

    Mean is the arithmetic mean or average.

    Median of a variable is the value that lies in the middle of the data when arranged in ascending order.

    Mode of a variable is the most frequent observation of the variable that occurs in the data set.


    RELATIONSHIP BETWEEN MEAN, MEDIAN AND MODE


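    MEAN

    POPULATION

    Mean: $\mu = \dfrac{x_1 + x_2 + \cdots + x_N}{N} = \dfrac{\sum_{i=1}^{N} x_i}{N}$, where N is the number of observations in the population

    SAMPLE

    Mean: $\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n} = \dfrac{\sum_{i=1}^{n} x_i}{n}$, where n is the number of observations in the sample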


    MEDIAN

    Steps in Computing the Median of a Data Set

    Step 1: Arrange the data in ascending order.

    Step 2: Determine the number of observations, n.

    Step 3: Determine the observation in the middle of the data set.

    If the number of observations is odd, then the median is the data value that is exactly in the middle of the data set. That is, the median is the observation that lies in the $\frac{n+1}{2}$ position.

    If the number of observations is even, then the median is the mean of the two middle observations in the data set. That is, the median is the mean of the data values that lie in the $\frac{n}{2}$ and $\frac{n}{2} + 1$ positions.
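The three steps can be sketched in Python as follows. The function name `median` is illustrative, and the positions in the steps are 1-based while Python lists are 0-based, hence the index shifts.

```python
def median(values):
    """Median following the three steps: sort, count, pick the middle."""
    data = sorted(values)               # Step 1: ascending order
    n = len(data)                       # Step 2: number of observations
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    if n % 2 == 1:                      # Step 3, odd n: position (n + 1) / 2
        return data[(n + 1) // 2 - 1]   # minus 1 for 0-based indexing
    # Step 3, even n: mean of the values in positions n/2 and n/2 + 1
    return (data[n // 2 - 1] + data[n // 2]) / 2

print(median([7, 1, 5]))     # odd n: the middle value of 1, 5, 7
print(median([7, 1, 5, 3]))  # even n: the mean of 3 and 5
```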


    MODE

    To compute the mode, tally the number of observations that occur for each data value. The data value that occurs most often is the mode. A set of data can have no mode, one mode or more than one mode. If no observation occurs more often than the others, we say the data has no mode.
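The tallying procedure can be sketched in Python with `collections.Counter`. The function name `modes` is illustrative, and the code assumes the convention that a data set in which every distinct value occurs equally often has no mode.

```python
from collections import Counter

def modes(values):
    """Return a sorted list of modes; an empty list means the data has no mode."""
    counts = Counter(values)  # tally the observations for each data value
    if not counts:
        return []
    highest = max(counts.values())
    winners = [v for v, c in counts.items() if c == highest]
    # Assumed convention: if all distinct values tie, there is no mode.
    if len(winners) == len(counts) and len(counts) > 1:
        return []
    return sorted(winners)

print(modes([1, 2, 2, 3]))     # one mode
print(modes([1, 1, 2, 2, 3]))  # more than one mode
print(modes([1, 2, 3]))        # no mode
```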


    EXAMPLE

    The following data represent the monthly phone bill (in RM) for six randomly selected months.

    35.34 42.09 39.43 38.93 43.39 49.26

    Calculate the mean, median and mode for the monthly phone bill.

    Solutions:
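A worked solution under the definitions of mean, median and mode given earlier might look like the following, with the mean rounded to two decimal places:

```latex
\begin{align*}
\bar{x} &= \frac{35.34 + 42.09 + 39.43 + 38.93 + 43.39 + 49.26}{6}
         = \frac{248.44}{6} \approx 41.41 \\
\text{Sorted data} &: 35.34,\ 38.93,\ 39.43,\ 42.09,\ 43.39,\ 49.26 \\
\text{Median} &= \frac{39.43 + 42.09}{2} = \frac{81.52}{2} = 40.76
  \quad (n = 6 \text{ is even, so average positions 3 and 4}) \\
\text{Mode} &: \text{none, since every value occurs exactly once}
\end{align*}
```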


    1.2.3 Measures of Dispersion

    Range of a variable is the difference between the largest data value and the smallest data value.

    Variance is based upon the difference between each observation and the mean; that is, it is based upon the deviation about the mean.

    Standard deviation tells us how closely the values of a data set are clustered around the mean.


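    POPULATION

    Mean: $\mu = \dfrac{\sum_{i=1}^{N} x_i}{N}$

    Variance: $\sigma^2 = \dfrac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2 = \dfrac{1}{N} \sum_{i=1}^{N} x_i^2 - \mu^2$

    Standard deviation: $\sigma = \sqrt{\sigma^2} = \sqrt{\dfrac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}$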


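    SAMPLE

    Mean: $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$

    Variance: $s^2 = \dfrac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \dfrac{1}{n-1} \left[ \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \right]$

    Standard deviation: $s = \sqrt{s^2} = \sqrt{\dfrac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$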


    EXAMPLE

    The following data represent the monthly phone bill (in RM) for six randomly selected months.

    35.34 42.09 39.43 38.93 43.39 49.26

    Compute the range, sample variance and sample standard deviation.

    Range = Largest Data Value - Smallest Data Value

    = 49.26 - 35.34

    = 13.92


    Sample variance, $s^2$


    Sample standard deviation, s
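The three quantities for the phone bill data can be checked with a short Python sketch using the standard-library `statistics` module, whose `variance` and `stdev` functions use the sample (n - 1) divisor. The variable names are illustrative.

```python
import statistics

bills = [35.34, 42.09, 39.43, 38.93, 43.39, 49.26]  # monthly phone bills in RM

data_range = max(bills) - min(bills)  # largest value minus smallest value
s2 = statistics.variance(bills)       # sample variance, divides by n - 1
s = statistics.stdev(bills)           # sample standard deviation, sqrt of s2

print(f"range = {data_range:.2f}")
print(f"s^2   = {s2:.2f}")
print(f"s     = {s:.2f}")
```

Working the same sums by hand gives a sample variance of about 22.58 and a sample standard deviation of about 4.75.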


    PRACTICE PROBLEM

    An engineer is interested in testing the bias in a pH meter. Data are collected on the meter by measuring the pH of a neutral substance (pH = 7.0). A sample of size 10 is taken with results given by

    7.07 7.00 7.10 6.97 7.00 7.03 7.01 7.01 6.98 7.08

    Compute the range, sample variance and sample standard deviation.


    MEASURES OF CENTRAL TENDENCY AND DISPERSION FOR GROUPED DATA

    MEASUREMENT           POPULATION    SAMPLE
    MEAN
    VARIANCE
    STANDARD DEVIATION
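The entries of this table are not legible here. As a hedged reference, the commonly used textbook formulas for grouped data are sketched below. The notation is an assumption: $m_i$ is the midpoint of class $i$, $f_i$ is its frequency, $k$ is the number of classes, and $N = \sum f_i$ (population) or $n = \sum f_i$ (sample). These may differ in form from the formulas on the original slide.

```latex
\begin{align*}
\text{Population:}\quad
  \mu &= \frac{\sum_{i=1}^{k} f_i m_i}{N}, &
  \sigma^2 &= \frac{\sum_{i=1}^{k} f_i (m_i - \mu)^2}{N}, &
  \sigma &= \sqrt{\sigma^2} \\[4pt]
\text{Sample:}\quad
  \bar{x} &= \frac{\sum_{i=1}^{k} f_i m_i}{n}, &
  s^2 &= \frac{\sum_{i=1}^{k} f_i (m_i - \bar{x})^2}{n - 1}, &
  s &= \sqrt{s^2}
\end{align*}
```

These mirror the ungrouped formulas, with each class midpoint standing in for the observations that fall in that class.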


    EXAMPLE

    The following data give the monthly expenditures (in hundred RM) on food for 30 households randomly selected from the households that incurred such expenses.

    4.57 3.95 6.95 3.80 1.50 3.99 7.84 5.05 8.00 14.75
    9.33 1.05 5.08 7.00 9.60 18.99 9.15 11.32 4.75 9.95
    3.63 1.99 1.39 13.09 19.31 11.15 7.73 12.00 7.58 16.35

    Find the sample mean of the monthly expenditures on food for the 30 households.


    Sample Variance Sample standard deviation


    1.2.4 Reasons for sampling

    To contact the whole population would be time-consuming.

    The cost of studying all the items in a population may be prohibitive.

    It is physically impossible to check all items in the population.

    Some tests are destructive in nature.

    The sample results are adequate.


    END OF CHAPTER 1