
Percentages and Random Number Generation


Researchers' Corner, J-gate newsletter, vol.4, no.5, May 2012. http://informindia.co.in/Jgatenewsletter-current.html


Researchers’ Corner

Working out Percentages and Random Number Generation

We have seen tally marking for preparing a frequency table, and the various parts and features of a table, as preparation for tabular presentation (Feb 2012 issue). Before we explore the four steps to tabular presentation, consider the following two tables, found in two recent draft papers, which attracted my attention.

Subject-wise distribution of books

Subjects                  Number of Books   Percentage
Art & Architecture                     10         1.06
Biographies                            38         4.03
Generalia                               9         0.95
Language & Literature                 116        12.30
Mysticism                               6         0.64
Religion & Philosophy                  33         3.50
Sciences                               46         4.88
Social sciences                       685        72.64
Total                                 943       100

It is a common mistake in tabular presentations to work out percentages in the wrong direction. In the table above and in the branch-wise table below, the sample books and sample students are distributed by subject and by branch of engineering respectively, ignoring the total population. The percentage of books digitized would have been more meaningful if worked out for each subject in relation to the total books in that subject; similarly, the number of students sampled in each branch needs to be related to the total students in that branch.

The first is a digitization study covering 943 books out of over 3 lakh (300,000) books in the library, and the second is a study of the use of e-resources by engineering students, with data collected through a questionnaire from 150 sample students selected from a population of 2160. It is claimed that the simple random method was adopted, but no clue is given about how the size of the sample was determined or what process was followed for random sampling. In both cases, how the sample (or response) is distributed among characteristics like ‘subject’ in the first case and engineering branch in the second is not examined. Some tips relating to percentages are listed after the branch-wise table below.

Branch-Wise Distribution of Engineering Students

Sl. No.   Branch                                   Number of Students   Percentage
1         Electronic Communication Engineering                     30           20
2         Computer Science Engineering                             32           21
3         Information Technology                                   30           20
4         Electrical and Electronic Engineering                    25           17
5         Mechanical Engineering                                   19           13
6         Civil Engineering                                        14            9
          TOTAL                                                   150          100
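The difference between the two directions can be sketched in a few lines of Python. The sample counts come from the table above; the branch enrolments are HYPOTHETICAL figures invented purely for illustration (the draft paper reports only the total population of 2160), and the function names are likewise illustrative.

```python
# Sample counts taken from the branch-wise table above.
sample = {
    "Electronic Communication Engineering": 30,
    "Computer Science Engineering": 32,
    "Information Technology": 30,
    "Electrical and Electronic Engineering": 25,
    "Mechanical Engineering": 19,
    "Civil Engineering": 14,
}

# HYPOTHETICAL branch enrolments (not from the source); they are chosen
# only so that they add up to the reported population of 2160.
population = {
    "Electronic Communication Engineering": 420,
    "Computer Science Engineering": 480,
    "Information Technology": 360,
    "Electrical and Electronic Engineering": 360,
    "Mechanical Engineering": 300,
    "Civil Engineering": 240,
}


def share_of_sample(counts):
    """Percentage of the whole sample falling in each branch (as in the table)."""
    total = sum(counts.values())
    return {branch: 100 * c / total for branch, c in counts.items()}


def sampling_fraction(counts, totals):
    """Percentage of each branch's own students that were sampled."""
    return {branch: 100 * counts[branch] / totals[branch] for branch in counts}


shares = share_of_sample(sample)
fractions = sampling_fraction(sample, population)
for branch in sample:
    print(f"{branch:40s} {shares[branch]:5.1f}% of sample, "
          f"{fractions[branch]:5.1f}% of branch")
```

With real enrolment figures, the second column would show at a glance whether any branch is over- or under-represented in the sample.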

Volume 4 Issue 5 May 2012


• Percentages, including ratios and proportions, should be computed in the direction of the causal factor, if any
• Percentages should run only in the direction in which a sample is representative
• Do not average percentages without weighting them by the size of the samples
• Do not use very large percentages (e.g. a 1200% increase)
• Do not use too small a base (e.g. 33 1/3% for 1 in 3)
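The tip about averaging percentages can be illustrated with a small Python sketch. The two groups below are made-up numbers, not data from either draft paper, and the function names are illustrative.

```python
def naive_mean_percentage(groups):
    """Unweighted mean of the group percentages (the practice to avoid)."""
    return sum(100 * count / base for count, base in groups) / len(groups)


def pooled_percentage(groups):
    """Percentage weighted by sample size: total count over total base."""
    total_count = sum(count for count, _ in groups)
    total_base = sum(base for _, base in groups)
    return 100 * total_count / total_base


# HYPOTHETICAL groups as (count, base): 10 of 20 is 50%, 90 of 900 is 10%.
groups = [(10, 20), (90, 900)]

print(f"unweighted mean: {naive_mean_percentage(groups):.2f}%")
print(f"weighted (pooled): {pooled_percentage(groups):.2f}%")
```

The unweighted mean lets the 20-unit group count as much as the 900-unit group, which is why the two figures differ so widely.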

Incidentally, the size of a sample should be:

• Adequate, to provide an estimate with sufficiently high precision
• Representative, to mirror the various patterns and sub-classes of the population
• Neither too large nor too small, but optimum to meet efficiency (cost), reliability (precision) and flexibility

The higher the precision required and the larger the variance, the larger the sample size and the greater the cost.

The essence of Simple Random Sampling (SRS) is that every unit in the population has a non-zero, equal probability of being selected, i.e., on a single draw from a population of N units the probability of a particular unit being selected is 1/N [with replacement this holds on every draw; without replacement, a unit not yet selected has probability 1/(N-1) on the second draw]. If we have to select n units (the sample size) without replacement from a finite population of N, every possible sample of n units is equally likely, with probability 1/C(N,n) = n!(N-n)!/N!, and every unit has probability n/N of being included in the sample. For example, if N=5 and n=2, there are 10 possible samples, each with probability 2!3!/5! = 1/10, and each unit is included with probability 2/5. A simple random sample is usually selected without replacement.

The phrases ‘Random Sample’ and ‘Simple Random Sample’ are often wrongly used interchangeably. As mentioned above, in SRS each unit of the population has a non-zero equal probability of being selected, whereas in a ‘Random Sample’ each unit may have any known (equal or unequal) probability of selection.
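For simple random sampling without replacement, the probabilities for a small case such as N=5 and n=2 can be checked by listing every possible sample. The following is a minimal Python sketch using only the standard library; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

N, n = 5, 2
units = range(1, N + 1)

# Every possible unordered sample of n distinct units.
samples = list(combinations(units, n))

# Under SRS each sample is equally likely.
p_sample = Fraction(1, len(samples))

# Inclusion probability of a unit: total probability of the samples containing it.
inclusion = {u: sum(p_sample for s in samples if u in s) for u in units}

print("number of possible samples:", len(samples), "=", comb(N, n))
print("probability of each sample:", p_sample)
print("inclusion probability of each unit:", inclusion[1])
```

Every unit ends up with the same inclusion probability, n/N, which is the defining property of SRS.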

The selection process for a finite population could be one of the following:

1. Lottery method (blindfolded or using a rotating drum) is an old classical method. All the units in the population are numbered from 1 to N (this numbered list is called the sampling frame), the numbers are written on small slips of paper, and the slips are thoroughly mixed in the drum before being picked blindfolded. This method is used when the size of the population is small.

2. Random number table (like Tippett's numbers) is used for a larger population, as it is difficult to mix the slips properly in the lottery method. For example, if the population is up to 100, one takes two-digit numbers from the table of random numbers, starting from any column or row of the table. Any number larger than the population size is ignored, and in ‘sampling without replacement’ a number that repeats is not considered again. For example, to select 10 items from a population consisting of 150 items:

• Read three-digit numbers and use only those from 1 to 900 (900 being the highest multiple of 150 less than 1000), so that every item corresponds to six numbers; for instance, a number can be reduced to an item number between 1 and 150 by taking its remainder on division by 150
• Select a starting position in the random number table
• Continue to choose numbers between 1 and 900 whose item has not already been selected, till you reach 10 items

Both the lottery method and the random number table method can be cumbersome, particularly for large sample sizes.
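The random number table procedure for 10 items out of 150 can be sketched in Python. Two assumptions are made here: a seeded pseudo-random generator stands in for the printed table, and numbers from 1 to 900 are mapped to items by the remainder on division by 150 (a remainder of 0 standing for item 150). The function names and the seed are hypothetical.

```python
import random


def table_numbers(rng, digits=3):
    """Stand-in for reading successive three-digit numbers from a printed table."""
    while True:
        yield rng.randrange(10 ** digits)  # 000 to 999


def srs_from_table(numbers, N, n, digits=3):
    """Select n distinct items out of N from a stream of table numbers."""
    if n > N:
        raise ValueError("sample size cannot exceed population size")
    # Highest multiple of N that fits in the given number of digits (900 for N=150).
    limit = ((10 ** digits - 1) // N) * N
    chosen = []
    for num in numbers:
        if num < 1 or num > limit:
            continue  # outside 1..limit: ignore, as with a printed table
        item = num % N or N  # remainder 0 stands for item N
        if item not in chosen:  # sampling without replacement: skip repeats
            chosen.append(item)
        if len(chosen) == n:
            break
    return chosen


# The seed plays the role of the starting position in the table.
rng = random.Random(2012)
print(srs_from_table(table_numbers(rng), N=150, n=10))
```

Rejecting numbers above 900 keeps the selection fair: each of the 150 items is reachable from exactly six of the accepted numbers.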

3. Computer-generated random numbers can be obtained from free sources like StatTrek's Random Number Generator (http://stattrek.com/statistics/random-number-generator.aspx) or the Random Integer Generator (http://www.random.org/integers/ ). Just answer the questions online, such as how many random numbers are needed, the minimum and maximum values, whether to allow duplicates and an optional seed number, and you will have the SRS numbers in seconds.

The above example (under 2 above, Random number table) of choosing a sample of 10 from a population of 150, when tried in the Random Integer Generator, gave the result: ‘Here are your random numbers’

7 14 82 80 87

109 46 35 73 134
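A similar selection can be made offline. The following is a minimal sketch using Python's standard `random` module, which is an assumption of this example and not a tool mentioned by the online generators; `random.sample` draws without replacement, so no duplicates occur. The seed value is arbitrary, and the numbers produced will differ from those shown above.

```python
import random


def simple_random_sample(N, n, seed=None):
    """Draw n distinct unit numbers from a sampling frame numbered 1 to N."""
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return rng.sample(range(1, N + 1), n)


# 10 units out of a population of 150; the seed 2012 is an arbitrary choice.
print(simple_random_sample(150, 10, seed=2012))
```

Recording the seed alongside the sample lets a reader reproduce the selection, which addresses the concern about undocumented sampling processes.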

There are other selection processes, such as the grid system for selecting a sample of an area. Note that SRS is the basic selection process and all other complex random sampling procedures are built on SRS.

M S Sridhar

[email protected]