Researchers' Corner in J-gate newsletter 4(7) July 2012. http://informindia.co.in/Jgatenewsletter-current.html
Researchers’ Corner
Four Steps to Tabular Presentation of Data
I have twice mentioned the four steps to tabular presentation, and here they are for
novice researchers who have had little exposure to basic statistics. To recapitulate, the
March 2012 issue elaborated the preparatory work for tabulation, such as tally marking,
which leads to a frequency table that displays data in a concise and logical order with
one-way, two-way, or three-way classification, depending on the number of characteristics
involved. Note that raw data can be classified broadly in four ways: qualitative,
quantitative, temporal and spatial (see box for their definitions). Classification, by
organizing similar things into groups or classes, brings order to the data, and classified
data can easily be subjected to further statistical analysis. While tabulating, mutually
exclusive and collectively exhaustive classes (or groups) are created based on common
characteristics. We use attributes for qualitative data (statistics of attributes) and
class intervals, class limits, magnitudes and frequencies for quantitative data
(statistics of variables). The four steps presented here apply to quantitative data only.
i) Qualitative classification is based on qualitative characteristics like status,
nationality, religion, marital status and gender.
ii) Quantitative classification is based on characteristics measured quantitatively like
age, height, income, etc. Quantitative variables can be continuous or discrete. A
continuous variable can take any numerical value, like weight or height. A discrete
variable can take only certain values, moving in finite ‘jumps’, like the number of books.
It ‘jumps’ from one value to another but does not take any intermediate value between
them. For example, we can have 71.5 kg as the weight of a person, but we cannot have 2.5
persons.
iii) Temporal (or chronological) classification uses time, such as hours, days, weeks,
months or years, as the classifying variable (data arranged in this way are called a time
series).
iv) Spatial classification is based on place as the classifying variable, like village,
town, block, district, state or country.
Volume 4 Issue 7 July 2012 Newsletter
1. Decide the number of classes: First, find the range and the variation in the values of
the variable. The range is the difference between the largest and the smallest value of
the variable. (The span covered by the table is the sum of all class intervals, i.e., the
number of classes multiplied by the class interval, and it must be at least as large as
the range.) In the sample Table 2 of the March issue (see table), the price of elementary
textbooks ranged from, say, 4 to 99, giving a range of 95. It was decided to have 10
classes, each with a size (class interval) of 10.
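The arithmetic of this first step can be sketched in a few lines of Python. This is only an illustration: the list of prices is hypothetical (only the minimum of 4 and the maximum of 99 come from the example above), and the variable names are chosen here for readability.

```python
# Hypothetical textbook prices. Only the smallest (4) and largest (99)
# values come from the article's example; the rest are made up.
prices = [4, 12, 18, 25, 33, 47, 52, 68, 75, 99]

# Range = largest value minus smallest value.
data_range = max(prices) - min(prices)

# Choices made in the article's example.
num_classes = 10
class_interval = 10

# Span covered by the table = number of classes x class interval.
span_covered = num_classes * class_interval

print(data_range)    # 95
print(span_covered)  # 100
```

The span covered by the classes (100) is slightly larger than the range (95), so every observation has a class to fall into.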
2. Decide the size of each class: This decision is interlinked with the previous one, i.e.,
with the number of classes. The rule of thumb is to have 5 to 15 classes. The size of a
class can also be worked out with the formula i = R / (1 + 3.3 log N), where i is the size
of the class interval, R is the range, N is the number of items to be grouped and log is
the common (base-10) logarithm. In the table referred to above, as already mentioned, we
chose a size of 10 for each class.
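As a worked illustration of the formula, the sketch below evaluates i = R / (1 + 3.3 log N) in Python. The range R = 95 is taken from the example; the number of items N = 100 is an assumption made here purely for illustration, since the article does not state how many textbooks were tabulated. The function name is also hypothetical.

```python
import math


def suggested_class_interval(data_range, n_items):
    """Class size from the formula i = R / (1 + 3.3 * log10(N))."""
    return data_range / (1 + 3.3 * math.log10(n_items))


R = 95   # range from the article's example
N = 100  # ASSUMED number of items; not given in the article

suggested_classes = 1 + 3.3 * math.log10(N)  # the denominator of the formula
i = suggested_class_interval(R, N)

print(round(suggested_classes, 1))  # 7.6
print(round(i, 1))                  # 12.5
```

Under this assumed N, the formula suggests roughly 8 classes of size 12.5. In practice one would usually settle on a convenient round size close to the computed value, while staying within the rule of thumb of 5 to 15 classes.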
3. Determine the class limits: Choose a value less than the minimum value of the variable
as the lower class limit of the first class and a value greater than the maximum value of
the variable as the upper class limit of the last class. In the example, we chose 1 as the
lower class limit of the first class and 100 as the upper class limit of the last class. It
is important to choose the class limits in such a way that the mid-point (class mark) of
each class coincides, as far as possible, with a value around which the data tend to be
concentrated. That is, the class limits are chosen so that the mid-point of each class is
close to the average of the values falling in that class. Once the class limits are chosen,
we have the class intervals. In other words, the class intervals are the various intervals
of the variable chosen for classifying the data. In the example we chose equal class
intervals for all 10 classes. See the diagram showing the way mid-points of even and odd
class intervals are determined. Further, the class intervals can be either exclusive or
inclusive (see text box for further explanation).
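A minimal Python sketch of this step follows. It assumes the example table uses inclusive classes 1-10, 11-20, ..., 91-100, which is consistent with the limits of 1 and 100 and the size of 10 quoted above but is not spelled out in the article. The function names are hypothetical.

```python
def build_classes(first_lower_limit, class_interval, num_classes):
    """Return (lower limit, upper limit) pairs for equal, inclusive classes."""
    classes = []
    for k in range(num_classes):
        lower = first_lower_limit + k * class_interval
        upper = lower + class_interval - 1  # inclusive upper limit
        classes.append((lower, upper))
    return classes


def class_mark(lower, upper):
    """Mid-point of a class: (lower limit + upper limit) / 2."""
    return (lower + upper) / 2


# Limits and size taken from the article's example.
classes = build_classes(first_lower_limit=1, class_interval=10, num_classes=10)
marks = [class_mark(lower, upper) for lower, upper in classes]

print(classes[0], classes[-1])  # (1, 10) (91, 100)
print(marks[0], marks[-1])      # 5.5 95.5
```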
4. Find the frequency of each class: Count how many observations in the raw data fall in
each class, placing every observation in its class by tally marking (see March 2012
issue).
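The counting in this last step might be sketched in Python as below. The prices are hypothetical, the classes are assumed to be the inclusive classes 1-10, 11-20, ..., 91-100, and the function name is invented for this illustration.

```python
from collections import Counter


def frequency_table(values, first_lower_limit=1, class_interval=10, num_classes=10):
    """Count how many values fall in each inclusive class."""
    tally = Counter()
    for value in values:
        # Index of the class this value belongs to (0 for the first class).
        k = (value - first_lower_limit) // class_interval
        if not 0 <= k < num_classes:
            raise ValueError(f"{value} lies outside the class limits")
        tally[k] += 1

    table = []
    for k in range(num_classes):
        lower = first_lower_limit + k * class_interval
        upper = lower + class_interval - 1
        table.append(((lower, upper), tally[k]))
    return table


# Hypothetical raw data; only the extremes 4 and 99 come from the article.
prices = [4, 9, 12, 18, 25, 25, 33, 47, 52, 68, 75, 99]

table = frequency_table(prices)
for (lower, upper), frequency in table:
    print(f"{lower:3d}-{upper:3d}: {frequency}")
```

Each observation is counted exactly once, so the frequencies add up to the number of observations, which is a quick check that the classes are mutually exclusive and exhaustive.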
Lastly, one may wonder why all this mind-boggling exercise is needed when software provides
ready-to-use tables. True, software removes much of the statistical drudgery, but the
concepts and terms in these steps are required even to use the software. As an exercise,
try the pivot table tool of Excel to generate a frequency table with five classes.
M S Sridhar
(i) Exclusive method: When the upper class limit of one class equals the lower class limit
of the next class, the interval is exclusive. This is suitable for data from a continuous
variable. While recording frequencies, the lower class limit of a class is included in the
interval but the upper class limit is excluded.
(ii) Inclusive method: If both the lower and the upper class limits are part of the class
interval, the interval is inclusive. If there is a ‘gap’ or discontinuity between the upper
limit of a class and the lower limit of the next class, the class intervals are adjusted.
The procedure is to divide the difference between the upper limit of the first class and
the lower limit of the second class by 2, subtract the result from all lower class limits
and add it to all upper class limits. This adjustment restores continuity of the data in
the frequency distribution. The adjusted class mark is then (Adjusted upper limit +
Adjusted lower limit) / 2.
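The adjustment described under the inclusive method can be illustrated with a short Python sketch. The three classes used here are hypothetical, and the function names are invented for this example.

```python
def make_continuous(inclusive_classes):
    """Close the gap between inclusive classes by shifting limits by half the gap."""
    # Gap between the upper limit of the first class and the lower limit of the second.
    gap = inclusive_classes[1][0] - inclusive_classes[0][1]
    half_gap = gap / 2
    # Subtract from every lower limit, add to every upper limit.
    return [(lower - half_gap, upper + half_gap) for lower, upper in inclusive_classes]


def in_exclusive_class(value, lower, upper):
    """Exclusive method: lower limit included, upper limit excluded."""
    return lower <= value < upper


# Hypothetical inclusive classes with a gap of 1 between neighbours.
inclusive = [(1, 10), (11, 20), (21, 30)]
adjusted = make_continuous(inclusive)
adjusted_marks = [(lower + upper) / 2 for lower, upper in adjusted]

print(adjusted)        # [(0.5, 10.5), (10.5, 20.5), (20.5, 30.5)]
print(adjusted_marks)  # [5.5, 15.5, 25.5]
```

After the adjustment the upper limit of each class equals the lower limit of the next, so the classes can be read as exclusive intervals: a value of 10.5 would be counted in the second class, not the first. The class marks are unchanged by the adjustment.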