Statistics - Describing Data I

  • 8/6/2019 Statistics - Describing Data I

    Lecture 2: Describing Data I

    GOALS

    ONE

    Organize data into a frequency distribution.

    TWO

    Portray a frequency distribution in a histogram, frequency polygon, and

    cumulative frequency polygon.

    THREE

    Develop a stem-and-leaf display.

    FOUR

    Present data using such graphic techniques as line charts, bar charts,

    and pie charts.

    EXAMPLE 1

    Dr. Jame is Dean of the School of Business, National University. He wishes to prepare a report showing the number of hours per week students spend studying. He selects a random sample of 30 students and determines the number of hours each student studied last week.

    15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
    17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
    10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6.

    Frequency Distribution

    A frequency distribution is a grouping of data

    into mutually exclusive categories showing the

    number of observations in each class.

    Example 1 continued

    Estimate the number of classes.

    There are 30 observations. Since 2^5 = 32 > 30, we should have at least 5 classes.

    Find the range (R) to determine the class width.

    The range is 33.8 - 10.3 = 23.5 hours. Dividing by 5 classes gives 4.7, so choose an interval of 5 hours.

    Set the first lower limit.

    The lower limit of the first class is 7.5 hours.

    Count the number of values in each class and fill in

    the table.
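The class-count and class-width steps above can be sketched in Python. This is a minimal illustration, assuming the rule of thumb is "the smallest k with 2^k greater than the number of observations". The helper name `suggest_class_count` is hypothetical and not part of the lecture.

```python
import math

# Hours studied last week by the 30 sampled students (Example 1).
hours = [
    15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
    17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
    10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6,
]


def suggest_class_count(n):
    """Return the smallest k such that 2**k > n (assumed rule of thumb)."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


k = suggest_class_count(len(hours))      # 5, because 2**5 = 32 > 30
data_range = max(hours) - min(hours)     # 33.8 - 10.3, about 23.5 hours
raw_width = data_range / k               # about 4.7 hours
width = math.ceil(raw_width)             # rounded up to a convenient 5 hours

print("classes:", k)
print("range:", round(data_range, 1))
print("class width:", width)
```

The rule gives only a minimum number of classes. With a first lower limit of 7.5 and a width of 5, six classes are needed to reach the largest value, 33.8.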

    Example 1 continued

    Hours               Frequency (f)
    7.5 up to 12.5       1
    12.5 up to 17.5     12
    17.5 up to 22.5     10
    22.5 up to 27.5      5
    27.5 up to 32.5      1
    32.5 up to 37.5      1
    Total               30
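The tally in this table can be reproduced with a short Python sketch. It assumes each class includes its lower limit and excludes its upper limit ("up to"). The function name `frequency_distribution` is hypothetical.

```python
# Hours studied last week by the 30 sampled students (Example 1).
hours = [
    15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
    17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
    10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6,
]


def frequency_distribution(values, first_lower, width, n_classes):
    """Count values per class [lower, lower + width); upper limit excluded."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - first_lower) // width)
        if 0 <= index < n_classes:   # ignore values outside all classes
            counts[index] += 1
    return counts


counts = frequency_distribution(hours, first_lower=7.5, width=5, n_classes=6)

for i, f in enumerate(counts):
    lower = 7.5 + 5 * i
    print(f"{lower:4.1f} up to {lower + 5:4.1f}  {f:2d}")
print("Total", sum(counts))
```

Because every class has the same width, integer division finds the class index directly, and no value can fall into two classes.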

    Frequency Distribution Terminology

    Class midpoint:

    A point that divides a class into two equal parts. This is the average of the upper and lower class limits.

    Class frequency:

    The number of observations in each class.

    Class interval:

    The class interval is obtained by subtracting the lower limit of a class from the lower limit of the next class.
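As a small illustration of these definitions, the sketch below computes the class interval and class midpoints for the study-hours classes of Example 1. The variable names are illustrative only.

```python
# Lower class limits from Example 1 (7.5 up to 12.5, 12.5 up to 17.5, ...).
lower_limits = [7.5, 12.5, 17.5, 22.5, 27.5, 32.5]

# Class interval: lower limit of the next class minus lower limit of this class.
class_interval = lower_limits[1] - lower_limits[0]

# Class midpoint: average of the lower and upper class limits.
midpoints = [(lower + (lower + class_interval)) / 2 for lower in lower_limits]

print("class interval:", class_interval)
print("midpoints:", midpoints)
```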

    Example 1 continued

    A relative frequency distribution shows the

    proportion of observations in each class.

    Relative Frequency Distribution

    Hours               Frequency (f)    Relative Frequency
    7.5 up to 12.5       1                1/30 = 0.0333
    12.5 up to 17.5     12               12/30 = 0.4000
    17.5 up to 22.5     10               10/30 = 0.3333
    22.5 up to 27.5      5                5/30 = 0.1667
    27.5 up to 32.5      1                1/30 = 0.0333
    32.5 up to 37.5      1                1/30 = 0.0333
    Total               30               30/30 = 1.0000
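The relative frequencies are each class frequency divided by the total number of observations. A minimal Python sketch using the class frequencies from Example 1:

```python
# Class frequencies from the Example 1 frequency distribution.
class_labels = [
    "7.5 up to 12.5", "12.5 up to 17.5", "17.5 up to 22.5",
    "22.5 up to 27.5", "27.5 up to 32.5", "32.5 up to 37.5",
]
counts = [1, 12, 10, 5, 1, 1]

total = sum(counts)
relative = [f / total for f in counts]   # proportion of observations per class

for label, f, r in zip(class_labels, counts, relative):
    print(f"{label:<16} {f:2d}  {f}/{total} = {r:.4f}")
print(f"{'Total':<16} {total:2d}  {sum(relative):.4f}")
```

The proportions sum to 1, which is a quick check that no observation was dropped or counted twice.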

    EXAMPLE 2

    Colin achieved the following scores on his twelve

    accounting quizzes this semester:

    86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85.

    Organize this data to show its distribution.

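    Sorted: 69, 78, 79, 82, 83, 84, 85, 86, 88, 91, 92, 96

    There are 12 observations. Since 2^4 = 16 > 12, at least 4 classes are recommended.

    Range = 96 - 69 = 27. Class width = 27/4 = 6.75, rounded up to 7.

    Classes: 66 up to 73, 73 up to 80, 80 up to 87, 87 up to 94, 94 up to 101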

    Stem-and-leaf Displays

    Stem-and-leaf display: A statistical technique for displaying a set of data. Each numerical value is divided into two parts: the leading digits become the stem and the trailing digits the leaf.

    Note: an advantage of the stem-and-leaf display over a frequency distribution is that we do not lose the identity of each observation.

    Example 2 continued

    Stem | Leaf
    6    | 9
    7    | 8 9
    8    | 2 3 4 5 6 8
    9    | 1 2 6
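The stem-and-leaf display for Colin's quiz scores can be built with a short Python sketch. It assumes two-digit scores, so the tens digit is the stem and the units digit is the leaf. The function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

# Colin's twelve accounting quiz scores (Example 2).
scores = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]


def stem_and_leaf(values):
    """Group each value's units digit (leaf) under its tens digit (stem)."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        groups[stem].append(leaf)
    return dict(groups)


display = stem_and_leaf(scores)
for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```

Every original score can be read back from the display (for example, stem 8 with leaf 6 is 86), which is the advantage over a frequency distribution noted earlier.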

    Graphic Presentation of a Frequency Distribution

    The three commonly used graphic forms are histograms, frequency polygons, and cumulative frequency distributions.

    A histogram is a graph in which the classes are marked on the horizontal axis and the class

    frequencies on the vertical axis.

    The class frequencies are represented by the

    heights of the bars and the bars are drawn adjacent

    to each other.

    Graphic Presentation of a Frequency Distribution

    A frequency polygon consists of line segments

    connecting the points formed by the class midpoint

    and the class frequency.

    A cumulative frequency distribution is used to

    determine how many or what proportion of the data values are below or above a certain value.
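For the study-hours data of Example 1, a cumulative frequency distribution can be sketched as running totals of the class frequencies. This is a minimal illustration in the "less than the upper class limit" form.

```python
from itertools import accumulate

# Upper class limits and class frequencies from Example 1.
upper_limits = [12.5, 17.5, 22.5, 27.5, 32.5, 37.5]
counts = [1, 12, 10, 5, 1, 1]

# Running total: number of students who studied less than each upper limit.
cumulative = list(accumulate(counts))
total = cumulative[-1]

for upper, c in zip(upper_limits, cumulative):
    print(f"less than {upper:4.1f} hours: {c:2d} students ({c / total:.1%})")
```

Reading the second row, 13 of the 30 students studied less than 17.5 hours, so 17 studied 17.5 hours or more.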

    Histogram for Hours Spent Studying

    [Figure: histogram. Horizontal axis: Hours spent studying (ticks 10 to 35). Vertical axis: Frequency (0 to 14).]
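Since the plotted figure does not survive in text form, the sketch below renders a rough text version of the histogram from the Example 1 frequency distribution. Bars are drawn horizontally here for simplicity, which is a departure from the vertical bars described in the lecture. The function name `text_histogram` is hypothetical.

```python
# Lower class limits, class width, and frequencies from Example 1.
lower_limits = [7.5, 12.5, 17.5, 22.5, 27.5, 32.5]
width = 5
counts = [1, 12, 10, 5, 1, 1]


def text_histogram(lower_limits, width, counts, symbol="#"):
    """Return one line per class; the bar length equals the class frequency."""
    lines = []
    for lower, f in zip(lower_limits, counts):
        label = f"{lower:4.1f} up to {lower + width:4.1f}"
        lines.append(f"{label} | {symbol * f} ({f})")
    return lines


lines = text_histogram(lower_limits, width, counts)
print("\n".join(lines))
```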

    Frequency Polygon for Hours Spent Studying

    [Figure: frequency polygon. Horizontal axis: Hours spent studying (ticks 10 to 35). Vertical axis: Frequency (0 to 14).]
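The points a frequency polygon connects can be listed with a short Python sketch using the Example 1 classes. Closing the polygon with zero-frequency points one class width beyond each end is a common textbook convention and is an assumption here, not something the slide states.

```python
# Lower class limits, class width, and frequencies from Example 1.
lower_limits = [7.5, 12.5, 17.5, 22.5, 27.5, 32.5]
width = 5
counts = [1, 12, 10, 5, 1, 1]

# Each vertex is (class midpoint, class frequency).
midpoints = [lower + width / 2 for lower in lower_limits]
vertices = list(zip(midpoints, counts))

# Assumed convention: anchor the polygon to the axis at both ends.
points = [(midpoints[0] - width, 0)] + vertices + [(midpoints[-1] + width, 0)]

for x, y in points:
    print(f"midpoint {x:4.1f} -> frequency {y}")
```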

    Cumulative Frequency Distribution for Hours Studying

    [Figure: cumulative frequency polygon. Horizontal axis: Hours spent studying (ticks 10 to 35). Vertical axis: Frequency (0 to 35).]

    Other Graphic Presentations of Data

    A line chart is useful for showing trends in data over time.

    A bar chart is useful for displaying differences between groups of data.

    A pie chart is useful for displaying a relative frequency distribution among groups of data.

    Example 3

    Construct a graphical presentation for the number

    of unemployed per 100,000 population for

    selected cities during 2001.

    Example 3 continued

    City                Number of Unemployed per 100,000 Population
    Atlanta, GA         7,300
    Boston, MA          5,400
    Chicago, IL         6,700
    Los Angeles, CA     8,900
    New York, NY        8,200
    Washington, D.C.    8,900

    Bar Chart for the Unemployment Data

    [Figure: bar chart. Horizontal axis: Cities (Atlanta, Boston, Chicago, Los Angeles, New York, Washington). Vertical axis: # unemployed / 100,000 (0 to 10,000). Bar values: 7300, 5400, 6700, 8900, 8200, 8900.]
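A rough text rendering of this bar chart can be produced from the Example 3 table. In the sketch below, one mark per 500 unemployed is an arbitrary scale chosen for illustration, and the function name `text_bar_chart` is hypothetical.

```python
# Unemployed per 100,000 population for selected cities, 2001 (Example 3).
unemployed = {
    "Atlanta, GA": 7300,
    "Boston, MA": 5400,
    "Chicago, IL": 6700,
    "Los Angeles, CA": 8900,
    "New York, NY": 8200,
    "Washington, D.C.": 8900,
}


def text_bar_chart(data, units_per_mark=500):
    """Return one line per category; the bar length is proportional to the value."""
    lines = []
    for name, value in data.items():
        bar = "#" * round(value / units_per_mark)
        lines.append(f"{name:<18}{bar} {value:,}")
    return lines


chart = text_bar_chart(unemployed)
print("\n".join(chart))
```

The bars make the differences between groups easy to compare at a glance: Boston is visibly the lowest, and Los Angeles and Washington tie for the highest.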

    EXAMPLE 4

    A sample of 200 runners was asked to indicate

    their favorite type of running shoe. Draw a

    graphical presentation for this data.

    Type of Shoe    Number of Runners
    Nike            92
    Adidas          49
    Reebok          37
    Asics           13
    Other            9

    Pie Chart for Running Shoes

    [Figure: pie chart with slices for Nike, Adidas, Reebok, Asics, and Other.]
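Each slice of a pie chart is proportional to the category's relative frequency. The sketch below computes each brand's share of the 200 runners from Example 4 and the corresponding slice angle, taking a full circle as 360 degrees.

```python
# Favorite running shoe among the 200 sampled runners (Example 4).
shoes = {"Nike": 92, "Adidas": 49, "Reebok": 37, "Asics": 13, "Other": 9}

total = sum(shoes.values())

# For each brand: (relative frequency, slice angle in degrees).
slices = {
    brand: (count / total, 360 * count / total)
    for brand, count in shoes.items()
}

for brand, (share, angle) in slices.items():
    print(f"{brand:<7}{share:7.1%}{angle:8.1f} degrees")
```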

    Exercises

    AIN is a leader in the logistics business. The following data, from its annual report, show primary net income per common share for the years 1999 to 2004.

    1999 2000 2001 2002 2003

    $0.50 $0.62 $1.03 $1.37 $1.34

    What kind of graphical tool should be used to present this data?

    Line Graph

    Exercises

    The following information reports the company's consumer sales (in millions) by region.

    Region           Sales
    Americas         574.50
    Europe           486.70
    Asian/Pacific     86.10

    What kind of graphical tool should be used to present this data?

    Bar Chart or Pie Chart

    Exercises

    Based on the previous exercise, which graphical tool, a bar chart or a pie chart, better describes the relative proportions of total sales?

    Pie Chart

    Homework

    Chapter 2:

    Problems: 10, 12, 14, 27, 28, and 48

    Chapter 4:

    Problems: 8