8/10/2019 Organization and Presentation of Data
http://slidepdf.com/reader/full/organization-and-presentation-of-data 1/55
Organization and Presentation of Data
Chapter Two
Describing Data: Frequency Distributions and Graphic Presentation
GOALS
When you have completed this chapter, you will be able to:
1. Organize data in an array and into a frequency distribution.
2. Portray a frequency distribution in a histogram, frequency polygon, and cumulative frequency polygon.
3. Present data using such graphic techniques as line charts, bar charts, and pie charts, in order to interpret the data being graphed.
• Raw data, or data that have not been summarized in any way, are sometimes referred to as ungrouped data.
• Collected data need to be organized so as to condense the information they contain and show patterns of variation clearly. Precise methods of analysis can be decided upon only when the characteristics of the data are understood.
• Data that have been organized in a frequency distribution are called grouped data.
2. Stem-and-Leaf Display
• A simple way to see how the data are distributed and where concentrations of data exist.
METHOD: Separate the sorted data series into leading digits (the stems) and trailing digits (the leaves).
Organizing Numerical Data: Stem-and-Leaf Display
A stem-and-leaf display organizes data into groups (called stems) so that the values within each group (the leaves) branch out to the right on each row.

Age of Surveyed College Students
Day Students:   16 17 17 18 18 18 19 19 20 20 21 22 22 25 27 32 38 42
Night Students: 18 18 19 19 20 21 23 28 32 33 41 45

Age of College Students (Day Students)
Stem   Leaf
1      67788899
2      0012257
3      28
4      2
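The method above can be sketched in a few lines of Python. This is only an illustration for two-digit integers: the function name `stem_and_leaf` is hypothetical, and the data are the day-student ages from the slide.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group sorted two-digit integers by tens digit (stem) and ones digit (leaf)."""
    rows = defaultdict(str)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # 27 -> stem 2, leaf 7
        rows[stem] += str(leaf)
    return dict(rows)


day_students = [16, 17, 17, 18, 18, 18, 19, 19, 20, 20, 21, 22,
                22, 25, 27, 32, 38, 42]

display = stem_and_leaf(day_students)
for stem, leaves in display.items():
    print(stem, leaves)
```

Passing the night-student ages to the same function would give the second display.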
The raw data are the numbers of Congressional bills vetoed during the administrations of seven U.S. presidents from Johnson to Clinton.

President   Johnson   Nixon   Ford   Carter   Reagan   Bush   Clinton
Vetoes      30        43      66     31       78       44     38

In stem-and-leaf terms, we could describe these data as follows:
Stem (10's Digit) / Leaf (1's Digit)
3/018 (represents 30, 31, and 38)
4/34 (represents 43 and 44)
6/6 (represents 66)
7/8 (represents 78)
3. Tabular Presentation
- a process of condensing classified data and arranging them systematically in rows and columns.

A. Frequency Table for Categorical Variable
The figures found in the cells of the main body are the frequencies and the percentages.

Example 1. Distribution of Employees in Terms of Civil Status
Civil Status   FREQUENCY   PERCENTAGE
Single         725         36.25%
Married        250         12.50%
Widowed        375         18.75%
Separated      650         32.50%
TOTAL          2000        100%
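As a quick check on the percentage column, each percentage is the category frequency divided by the total, times 100. A minimal Python sketch using the counts from Example 1 (the variable names are illustrative only):

```python
civil_status = {"Single": 725, "Married": 250, "Widowed": 375, "Separated": 650}

total = sum(civil_status.values())

# Percentage = 100 * frequency / total number of employees
percentages = {status: 100 * count / total
               for status, count in civil_status.items()}

for status, pct in percentages.items():
    print(f"{status:<10} {civil_status[status]:>5} {pct:>7.2f}%")
print(f"{'TOTAL':<10} {total:>5} {sum(percentages.values()):>7.2f}%")
```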
Cross Tabulations: The Contingency Table
A survey was conducted to study the importance of brand name to consumers as compared to a few years ago. The results, classified by gender, were as follows:

Importance of Brand Name   Male   Female   Total
More                       450    300      750
Equal or Less              3300   3450     6750
Total                      3750   3750     7500
Example: Marada Inn
Frequency Table
Rating          Frequency
Poor            2
Below Average   3
Average         5
Above Average   9
Excellent       1
Total           20
B. Contingency Table
A contingency table (or a two-way frequency table) is a table in which frequencies correspond to two variables. (One variable is used to categorize rows and a second variable is used to categorize columns.)

Example 2. Distribution of Employees in Terms of Gender and Smoking Status
Smoking Habit   Male   Female   Total
Smoker          100    70       170
Non-smoker      50     80       130
Total           150    150      300
Questions
1. How many are males in the sample?
2. How many are smokers in the sample?
3. How many females are non-smokers?
4. What percentage of the sample smokes?
5. What percentage of the smokers are males?
6. What percentage of the females are smokers?
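One way to work through these questions is to read the marginal totals and cell counts directly off the table. The Python sketch below does this for the smoking example; the dictionary layout and variable names are assumptions made for illustration. Note that questions 5 and 6 use different denominators (all smokers versus all females).

```python
# Cell counts from the Gender and Smoking Status table
counts = {
    ("Smoker", "male"): 100, ("Smoker", "female"): 70,
    ("Non-smoker", "male"): 50, ("Non-smoker", "female"): 80,
}

n = sum(counts.values())
males = sum(c for (habit, sex), c in counts.items() if sex == "male")
females = n - males
smokers = sum(c for (habit, sex), c in counts.items() if habit == "Smoker")
female_nonsmokers = counts[("Non-smoker", "female")]

pct_sample_smokes = 100 * smokers / n                              # question 4
pct_smokers_male = 100 * counts[("Smoker", "male")] / smokers      # question 5
pct_females_smoke = 100 * counts[("Smoker", "female")] / females   # question 6

print(males, smokers, female_nonsmokers)
print(round(pct_sample_smokes, 2), round(pct_smokers_male, 2),
      round(pct_females_smoke, 2))
```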
C. Frequency Distribution Table
- refers to the tabular arrangement (grouping) of all observations into intervals or classes together with the count of the number of observations that fall in each interval or class.

Remarks:
1. There is no clear and definite method of finding the number of intervals. The number of intervals depends on the size of the data. It is generally accepted that the number of intervals to use is from 6 to 15. If fewer than 6 intervals are used, much information is lost because many observations are lumped into one class. If too many classes are used, the graphical representation may show irregularities because many classes or intervals may contain small frequencies.
2. There is no definite rule in choosing the starting lower limit and the class width. The starting lower limit could be the smallest observation, or any number closest to the lowest observation, or any multiple of the class size (c).
3. In constructing the class limits, one may get one more or one less than the suggested number of intervals.
Definitions:
Class interval - the numbers defining a class
Class limits - the smallest and largest values that can fall in a given class
Class boundaries - numbers that are halfway between the upper limit of a class and the lower limit of the next class
Class size - length of the class interval; computed by taking the difference between two successive upper (or lower) class boundaries or class limits
Class mark - midpoint of an interval; computed by taking the average of the lower and upper class limits of a given class interval
Relative frequency - obtained by dividing the class frequency by the total number of observations
Relative percentage - obtained by multiplying the relative frequency by 100%
How to Construct a Frequency Distribution Table (Suggested Steps)
1. Determine the range (R) of the observations.
   R = highest value – lowest value
2. Determine the number of class intervals (k): $k = \sqrt{n}$
   Suggested Rule: k must be an integer. If the computed k is not an integer, then round it off to the next higher integer.
3. Determine the class width (c): $c = \frac{R}{k}$
   Suggested Rule: c must have the same number of decimal places as the original data.
4. Determine the lower limit and the upper limit of the class intervals.
5. Determine the frequency for each interval, class marks, class boundaries, cumulative frequencies (less than and greater than cf), and the corresponding percentage.
Additional Columns in the FDT
1. Class boundaries (CB). The CBs are obtained by taking the midpoint of the gaps between classes.
   LCB = LL – (0.5)(one unit of measure)
   UCB = UL + (0.5)(one unit of measure)
2. Class marks. The class mark is the midpoint of a class: $x_i = \frac{LL + UL}{2}$
3. Relative Frequency (RF). This is the frequency of a class expressed in proportion to the total number of observations: $RF = \frac{\text{Frequency}}{n}$
4. Cumulative Frequency. This is the accumulated frequency of a class.
   The <CF (less than CF). It is the total number of observations whose values do not exceed the upper limit of the class.
   The >CF (greater than CF). It corresponds to the total number of observations whose values are not less than the lower limit of the class.
5. Relative Cumulative Frequency. This is the cumulative frequency of a class expressed in proportion to the total number of observations.
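The additional columns can be computed mechanically once the class limits and frequencies are known. The following Python sketch is one possible way to do it; it borrows the class limits and frequencies of the length-of-service example in this chapter and assumes one unit of measure equals 1 month. The dictionary keys are illustrative names, not part of the original slides.

```python
limits = [(70, 79), (80, 89), (90, 99), (100, 109),
          (110, 119), (120, 129), (130, 139), (140, 149)]
freqs = [2, 12, 8, 6, 7, 6, 6, 3]
unit = 1          # assumed unit of measure (whole months)
n = sum(freqs)

rows = []
less_cf = 0       # running total for <CF
greater_cf = n    # remaining total for >CF
for (ll, ul), f in zip(limits, freqs):
    less_cf += f
    rows.append({
        "class": (ll, ul),
        "boundaries": (ll - 0.5 * unit, ul + 0.5 * unit),  # LCB, UCB
        "mark": (ll + ul) / 2,                             # class mark
        "rf": f / n,                                       # relative frequency
        "less_cf": less_cf,
        "greater_cf": greater_cf,
        "rcf": less_cf / n,                                # relative <CF
    })
    greater_cf -= f

for row in rows:
    print(row)
```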
Example 3. The following are the lengths of service (in months) of a sample of 50 employees in a certain shoe factory. Organize the data using a frequency distribution table.

EXCEL function: FREQUENCY(data_array, bins_array)
Press: Ctrl-Shift-Enter

70   87   99    112   127
78   88   99    114   132
80   89   100   115   132
80   89   106   117   132
82   92   106   119   135
84   94   107   120   136
85   95   108   123   136
86   95   109   124   140
87   97   110   125   140
87   98   112   125   146

Range: $R = 146 - 70 = 76$
Number of classes: $k = \sqrt{50} \approx 7.07$, rounded up to 8
Class width: $c = \frac{R}{k} = \frac{76}{8} = 9.5 \approx 10$
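The range, number of classes, and class width for this example can be reproduced with a short Python sketch. The helper name `class_plan` is hypothetical, and rounding the width half-up to the precision of the data is an assumption; the slides only require that c have the same number of decimal places as the data.

```python
import math
from decimal import Decimal, ROUND_HALF_UP


def class_plan(lowest, highest, n, decimals=0):
    """Return (range R, number of classes k, class width c)."""
    r = Decimal(str(highest)) - Decimal(str(lowest))
    k = math.ceil(math.sqrt(n))              # round up to the next integer
    quantum = Decimal(1).scaleb(-decimals)   # 1, 0.1, 0.01, ...
    c = (r / k).quantize(quantum, rounding=ROUND_HALF_UP)
    return r, k, c


r, k, c = class_plan(lowest=70, highest=146, n=50)
print(r, k, c)
```

For data recorded to one decimal place, calling the helper with `decimals=1` keeps the width at one decimal place as well.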
Output: Frequency Distribution Table
Class Intervals   Freq
70-79             2
80-89             12
90-99             8
100-109           6
110-119           7
120-129           6
130-139           6
140-149           3
Total             50
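The frequency column can be verified by tallying the 50 observations into the eight classes. A minimal Python sketch, assuming classes of width 10 that start at 70 as in the table:

```python
service_months = [
    70, 87, 99, 112, 127,
    78, 88, 99, 114, 132,
    80, 89, 100, 115, 132,
    80, 89, 106, 117, 132,
    82, 92, 106, 119, 135,
    84, 94, 107, 120, 136,
    85, 95, 108, 123, 136,
    86, 95, 109, 124, 140,
    87, 97, 110, 125, 140,
    87, 98, 112, 125, 146,
]

# Class limits 70-79, 80-89, ..., 140-149
classes = [(lower, lower + 9) for lower in range(70, 150, 10)]

# Count the observations that fall between each pair of class limits
freqs = [sum(1 for x in service_months if ll <= x <= ul)
         for ll, ul in classes]

for (ll, ul), f in zip(classes, freqs):
    print(f"{ll}-{ul}: {f}")
print("Total:", sum(freqs))
```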
Example 2
A recent report showed the following data for percentages of executives in 42 top US corporations suffering from drug abuse problems. Construct the frequency distribution.

5.1   9.1    13
5.5   9.1    13
5.9   9.3    14
6.5   9.8    14
6.8   9.9
7     10
7.2   10.2   15.
7.3   10.3   1
8.3   11     1
8.4   11.5   16.
8.5   11.7   16.
8.5   12.3   1
8.7   12.7   17.
8.8   13.2   17.
1. Find the highest and lowest values.
2. Determine the range. It is the difference between the highest and lowest values in the data set.
3. Determine the number of classes k. The number of classes is somewhat arbitrary. In general, your table should have between 5 and 20 classes. A simple rule you can follow to approximate the number of classes is $k = \sqrt{n}$.
4. Determine the width of the class intervals. (Round off the result to the nearest value whose precision is the same as that of the raw data.)
5. Select a starting point for the lower class limit. (This can be the smallest data value or any convenient number less than the smallest data value.)

Working:
1. Hv =
2. Range = Hv - Lv
3. $k = \sqrt{n}$ =
4. w = Range/k = ; w ≈
5. Starting point can be
Output: Frequency Distribution Table
Class Intervals   Freq   Class Marks   Class Boundaries   <CF   >CF
5.1-6.8           5      5.95          5.05-6.85
6.9-8.6           7      7.75
8.7-10.4          10     9.55
10.5-12.2         3      11.35
12.3-14.0         5      13.15
14.1-15.8         5      14.95
15.9-17.6         7      16.75
SEATWORK
• The following are the average weekly mortgage interest rates for a 40-week period.
1. In the following stem-and-leaf display for a set of two-digit integers, the stem is the 10's digit, and each leaf is the 1's digit. What is the original set of data?
2/002278
3/011359
4/1344
5/47
SEATWORK
• The accompanying data describe the hourly wage rates (dollars per hour) for 30 employees of an electronics firm:
22.66 24.39 17.31 21.02 21.61 20.97 18.58
19.74 21.57 20.56 22.16 20.16 18.97 22.64 1
22.05 22.03 17.09 24.60 23.82 17.80 16.28 19
22.22 19.49 22.27 18.20 19.29 20.43
Construct a frequency distribution for these data.
2.58 In 2007, unemployment rates in the 50 U.S. states were reported as follows.
Source: Bureau of the Census, Statistical Abstract of the United States 2009, p. 373.
a. Construct a stem-and-leaf display for these data.
b. Construct a frequency distribution for these data.
c. Determine the interval width and the class mark for each of the classes in your frequency distribution.
d. Based on the frequency distribution obtained in part (b), draw a histogram and a relative frequency polygon to describe the data.
In a study of reaction times to a specific stimulus, an animal trainer obtained the following data, given in seconds.

Class limits   Frequency
2.3 – 2.9
3.0 – 3.6
3.7 – 4.3
4.4 – 5.0
5.1 – 5.7
5.8 – 6.4

With reference to the table, determine:
a. The upper limit of the fourth class.
b. The class midpoint of the third class.
c. The class boundaries of the second class.
d. The size of the fifth class interval.
e. The number of animals which respond to the stimulus in 3.0 to 3.6 seconds.
f. The number of animals which respond to the stimulus after 4.35 seconds.
g. The percentage of animals which respond to the stimulus in 5.1 to 5.7 seconds.
h. The percentage of animals which respond to the stimulus in less than 3.65 seconds.
GRAPHICAL PRESENTATION
A method of presenting numerical values or relationships in pictorial form.
1. LINE GRAPHS
2. BAR GRAPHS
3. PIE CHARTS
Membership Growth of FICCO
Year   Total Number of Members
1980   987
1990   9186
2000   24026
2010   140128
2012   163722
Line graphs are typically used to show the change or trend in a variable over time.
The line graph is capable of simultaneously showing values of two quantitative variables (y, on the vertical axis, and x, on the horizontal axis); it consists of linear segments connecting points observed or measured for each variable.
A Bar Chart consists of a series of rectangular bars where the length of the bar represents the magnitude to be demonstrated. It can be used to depict any of the levels of measurement (nominal, ordinal, interval, or ratio).

[Figure: Bar Chart. Horizontal axis: Class Intervals (43-48, 49-54, 55-60, 61-66, 67-72, 73-78, 79-84). Vertical axis: Frequency (0 to 9).]
Example 2. Construct a bar chart for the number of unemployed per 100,000 population for selected cities during 2001.

City               Number of unemployed per 100,000 population
Atlanta, GA        7300
Boston, MA         5400
Chicago, IL        6700
Los Angeles, CA    8900
New York, NY       8200
Washington, D.C.   8900
[Figure: horizontal bar chart for Atlanta, GA; Boston, MA; Chicago, IL; Los Angeles, CA; New York, NY; and Washington, D.C. Horizontal axis: Number of unemployed per 100,000 population, marked in steps of 1000 starting at 0.]
A Pie Chart is useful for displaying a relative frequency distribution. A circle is divided proportionally to the relative frequency and portions of the circle are allocated for the different groups.

1. MONTHLY BUDGET OF A COLLEGE STUDENT
EXPENSES          AMOUNT
Lodging           P2,500
Food              5,000
School Supplies   500
Pocket Money      1,000
Miscellaneous     1,000
Recreation        500

PIE CHART
[Figure: pie chart, Monthly Budget of a College Student. Slice labels: Food 62%, School Supplies 6%, Pocket Money 13%, Miscellaneous 13%, Recreation 6%. These labels are shares of the P8,000 spent on items other than Lodging; no Lodging slice is labeled.]
EXAMPLE
• A sample of 200 runners were asked to indicate their favorite type of running shoe. Draw a pie chart based on the following information.

Type of shoe   # of runners   % of total
Nike           92             46.0
Adidas         49             24.5
Reebok         37             18.5
Asics          13             6.5
Other          9              4.5
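To draw the pie chart by hand, each category's share of the 200 runners is converted into a central angle out of 360 degrees. The Python sketch below computes both the percentage and the angle for each slice; the variable names are illustrative, and no plotting library is assumed.

```python
shoe_counts = {"Nike": 92, "Adidas": 49, "Reebok": 37, "Asics": 13, "Other": 9}

total = sum(shoe_counts.values())

# For each brand: (percent of total, central angle in degrees)
slices = {brand: (100 * count / total, 360 * count / total)
          for brand, count in shoe_counts.items()}

for brand, (pct, angle) in slices.items():
    print(f"{brand:<7} {pct:5.1f}%  {angle:6.1f} degrees")
```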
GRAPHICAL REPRESENTATION of the FREQUENCY DISTRIBUTION
• The three commonly used graphic forms are Histograms, Frequency Polygons, and Ogives.
A Histogram is a graph in which the class midpoints or limits are marked on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are represented by the heights of the bars, and the bars are drawn adjacent to each other.

[Figure: histogram. Horizontal axis: Hours spent studying (12.65, 17.45, 22.25, 27.05, 31.85). Vertical axis: Frequency (0 to 12).]
Output: Frequency Distribution Table
Class Intervals   Freq   Class Marks   Class Boundaries   <CF   >CF
5.1-6.8           5      5.95          5.05-6.85          5     42
6.9-8.6           7      7.75          6.85-8.65          12    37
8.7-10.4          10     9.55          8.65-10.45         22    30
10.5-12.2         3      11.35         10.45-12.25        25    20
12.3-14.0         5      13.15         12.25-14.05        30    17
14.1-15.8         5      14.95         14.05-15.85        35    12
15.9-17.6         7      16.75         15.85-17.65        42    7
HISTOGRAM
[Figure: histogram, Percentage of Executives Suffering from Drug Abuse Problems. Horizontal axis: CLASS MARKS. Vertical axis: NUMBER OF CORPORATIONS (0 to 12).]
A Frequency Polygon consists of line segments connecting the points formed by the class midpoint and the class frequency.

[Figure: frequency polygon, Percentage of Executives Suffering from Drug Abuse Problems. Horizontal axis: CLASS MARKS (4.15, 5.95, 7.75, 9.55, 11.35, 13.15, 14.95, 16.75). Vertical axis: NO. OF CORPORATIONS (0 to 12).]
Ogive (Cumulative frequency polygon)
A graph showing the cumulative frequency plotted against the class boundary. The graph can be a "less than" or a "greater than" ogive.

[Figure: Less Than Ogive. Horizontal axis: Class Boundaries (5.05, 6.85, 8.65, 10.45, 12.25, 14.05, 15.85, 17.65). Vertical axis: Less than cf. Plotted values: 0, 5, 12, 22, 25, 30, 35, 42.]

[Figure: Greater Than Ogive. Horizontal axis: Class Boundaries (5.05, 6.85, 8.65, 10.45, 12.25, 14.05, 15.85, 17.65). Vertical axis: Greater than cf. Plotted values: 42, 37, 30, 20, 17, 12, 7, 0.]
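The points of both ogives follow directly from the class frequencies. As a rough sketch in Python, using the class boundaries and frequencies of the executives example (variable names are illustrative): the less-than ogive starts at 0 on the first boundary and accumulates upward, while the greater-than ogive starts at n and falls to 0 on the last boundary.

```python
from itertools import accumulate

boundaries = [5.05, 6.85, 8.65, 10.45, 12.25, 14.05, 15.85, 17.65]
freqs = [5, 7, 10, 3, 5, 5, 7]
n = sum(freqs)

# Less-than cf at each boundary: observations below that boundary
less_than = [0] + list(accumulate(freqs))

# Greater-than cf at each boundary: observations above that boundary
greater_than = [n - cf for cf in less_than]

less_ogive = list(zip(boundaries, less_than))
greater_ogive = list(zip(boundaries, greater_than))

print(less_ogive)
print(greater_ogive)
```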
The Abuse of Visual Displays
Remember that visuals can be designed to be either emotionally charged or purposely misleading to the unwary viewer. This capacity to mislead is shared by a great many statistical tests and descriptions, as well as visual displays. We will consider just a few of the many possible examples where graphical methods could be viewed as misleading.