

A Genetic Algorithm for Generating Test From a Question Bank

MEHMET YILDIRIM

Electronics and Computer Education Department, University of Kocaeli, 41380 Kocaeli, Turkey

Received 7 April 2008; accepted 12 June 2008

ABSTRACT: The purpose of this study is to provide academicians with an efficient means of generating multiple-choice tests from a question bank. A genetic algorithm (GA) is used to optimize predefined criteria for selecting questions from the question bank. GA is a very useful optimization algorithm because of its versatility. However, the crossover and mutation operators of the standard GA cannot be used directly for test generation, since integer-coded individuals have to be used and these operators produce duplicated genes within individuals. In this study, a mutation operation is proposed that prevents duplications in individuals produced by crossover. The experiments and analysis show that the GA with the proposed mutation operator achieves a success rate of approximately 100%. © 2009 Wiley Periodicals, Inc. Comput Appl Eng Educ 18: 298–305, 2010; Published online in Wiley InterScience (www.interscience.wiley.com); DOI 10.1002/cae.20260

Keywords: genetic algorithm; computer based assessment; test generating; question bank

INTRODUCTION

With the development of information and networking technologies, technology-based education methods have become a central concern of modern education. E-learning, by using the Internet, supports life-long learning and distance learning. The World Wide Web has been viewed as an efficient, interactive system for knowledge transfer, collaboration, and learning. Among the numerous components of e-learning, assessment is an important part and plays a vital role in teaching and learning.

Computer-based assessment is attractive for

saving marking time with large classes. It may also

serve to shift the exam burden from lecturers to

others, such as computer technicians administering

the system [1]. However, computer-based assessment

systems have some limitations. Construction of good

objective questions requires skill and practice and so

is initially time consuming. Assessors and invigilators

need training in assessment design, and examinations

management.

A large question bank could facilitate exam

preparation by allowing a lecturer to create a test by

selecting questions, or the process could even be

automated to create a randomly selected test. Under

certain circumstances it is desirable to have a larger

number of questions than would normally be needed

for a single test, for security reasons [2].

Many computer-based assessment tests are mainly or exclusively of the multiple-choice type.

Correspondence to M. Yildirim ([email protected]).

Traditional descriptive-answer questions are generally easier to set, but they must be read and assessed by human graders. Grading of descriptive-answer examinations is therefore slow and costly, and suffers from the foibles of human variation in judgment and performance [3]. For large numbers of students, or for ongoing automated assessment, multiple-choice tests are highly advantageous. They are much quicker to grade and give near-instant results to academicians and students. In addition, the examination can be taken on-line at a distance and can take advantage of the possibilities of multimedia in presenting the questions. With growing numbers of students and increasing demands on academicians, the large reduction in time and cost offered by multiple-choice tests is making their use mandatory.

There are many studies in the literature, most of which are based on random selection of questions from a question bank. However, these works either use no criteria for question selection or use some criteria but need the participation of the academician to optimize them [4]. When many questions have to be selected, manual optimization fails. Assessments can be designed that draw a certain number of questions at random, thereby producing a unique subset of questions. If no criteria are used, all of the questions in the question bank are feasible and can be selected to create a unique subset. Thus, all of the questions in the subset may be difficult, or all may be easy. Moreover, questions frequently asked in past examinations may be selected again in the new subset. A question may even be selected twice or more in the same test. If some criteria are used, the number of feasible questions in the question bank is reduced; it may even be less than the number of questions required for a certain test size. In such a condition, an optimization algorithm is needed to select the most appropriate infeasible questions to complete the test.

The purpose of this study is to provide academicians with an efficient means of generating multiple-choice tests from a question bank. In this study, GA is used to optimize predefined criteria for selecting questions from a question bank. GA is a very useful optimization algorithm because of its versatility. However, the crossover and mutation operators of the standard GA cannot be used directly for test generation, since integer-coded individuals have to be used and these operators produce duplicated genes within individuals. To solve this problem, a mutation operation is proposed that prevents duplications in individuals produced by crossover and also directs the search randomly to new regions of the search space. A database containing 240 classified test questions was created together with predefined attributes for selecting questions. By using GA, 25 tests were generated. In these runs, five test sizes and five difficulty levels were requested. The average difficulties, the number of feasible questions, the number of violated questions, and the success of each run were observed. The experiments and analysis show that the GA with the proposed mutation operator is highly successful at feasible question selection.

STANDARD GENETIC ALGORITHM

GA is a parallel and global search technique that

emulates natural genetic operators. Because it simultaneously

evaluates many points in the parameter

space, it is more likely to converge toward the global

optimum. It is not necessary that the search space be

differentiable or continuous. GA applies operators

inspired by the mechanics of natural selection to a

population of binary strings encoding the parameter

space. At each generation, it explores different areas

of the search space, and then directs the search to

regions where there is a high probability of finding a

better solution.

The overall GA optimization system [5] is described by the schematic in Figure 1. GA starts with a randomly selected initial population of coded strings, which are generally called individuals or chromosomes. An individual is a potential solution of the problem and represents a set of parameters. The size of the population varies from one problem to another.

Figure 1 A simple genetic algorithm.

Each individual in the population is then assigned a probability of survival, in other words a fitness, according to its objective function value relative to those of the other individuals. The objective function of a problem is the main mechanism for evaluating the status of each individual and is an important link between the problem and the GA. It takes the individual as input and produces a number that measures the individual's performance. However, its range of values varies from problem to problem. To maintain uniformity over various problem domains, the objective function value is rescaled to a fitness value [6].

Candidate individuals that might survive into the next generation are selected from the population based on their fitness values. This means that genetically good individuals are favored as parents. The selection process may select the fittest individuals more than once for the mating population. Roulette wheel selection is a commonly used selection method.
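Roulette wheel selection can be illustrated with a minimal Python sketch. It assumes non-negative fitness values where larger is better (an objective that is minimized, such as a violation count, would first have to be rescaled); the function name `roulette_wheel_select` and its parameters are illustrative and do not come from the original study.

```python
import random

def roulette_wheel_select(fitnesses, n_parents, rng=random):
    """Return n_parents indices, each drawn with probability proportional to fitness.

    Assumes all fitness values are non-negative and at least one is positive.
    """
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("at least one fitness value must be positive")
    selected = []
    for _ in range(n_parents):
        spin = rng.uniform(0, total)  # position where the wheel stops
        running = 0.0
        for index, fit in enumerate(fitnesses):
            running += fit
            if fit > 0 and spin <= running:
                selected.append(index)
                break
        else:
            # Guard against floating-point round-off: use the last positive slice.
            selected.append(max(i for i, f in enumerate(fitnesses) if f > 0))
    return selected
```

Because every draw is independent, a fit individual may appear several times in the mating population, as described above.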

After selection, the crossover and mutation operations take place, in that order. Pairs of parent individuals are selected randomly from the mating population. Generally, crossover combines the features of two parent chromosomes to form two offspring, with the possibility that good chromosomes may generate better ones [7]. A random point along each pair of individuals is chosen as the crossover point, as shown in Figure 2. The bits of the two individuals after the crossover point are swapped with a probability equal to the crossover rate [8]. The crossover operation expands the search space around the fittest individuals [9].
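One-point crossover on binary individuals might be sketched as follows. The default crossover rate of 0.8 is an arbitrary illustrative value, not one reported in the study, and `one_point_crossover` is a hypothetical helper name.

```python
import random

def one_point_crossover(parent_a, parent_b, crossover_rate=0.8, rng=random):
    """Swap the tails of two equal-length individuals after a random cut point."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    # With probability (1 - crossover_rate) the parents are copied unchanged.
    if len(parent_a) < 2 or rng.random() >= crossover_rate:
        return list(parent_a), list(parent_b)
    point = rng.randint(1, len(parent_a) - 1)  # cut strictly inside the string
    child_a = list(parent_a[:point]) + list(parent_b[point:])
    child_b = list(parent_b[:point]) + list(parent_a[point:])
    return child_a, child_b
```

For binary strings every resulting offspring is a valid individual, which is why the standard operator needs no repair step in that setting.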

Next, all candidate individuals in the population are subjected to random mutation. This is a bit-wise binary complement operation (see Fig. 3), applied uniformly to all bits of all individuals in the population with a probability equal to the mutation rate. The mutation rate is comparatively lower than the crossover rate. The mutation operation expands the search space to regions that may not be close to the current population, thus ensuring a global search [10].
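The bit-wise complement mutation described above can be sketched in a few lines of Python. The default mutation rate of 0.01 is only an illustrative assumption; the study does not state the value it used.

```python
import random

def bit_flip_mutation(individual, mutation_rate=0.01, rng=random):
    """Complement each bit independently with probability mutation_rate."""
    return [1 - bit if rng.random() < mutation_rate else bit
            for bit in individual]
```

Each bit is tested separately, so the expected number of flipped bits in an individual is its length multiplied by the mutation rate.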

In the case of generation replacement, the

individuals in the old population are replaced by the

offspring. Since the best individuals of the population

may fail to reproduce offspring in the next generation,

it is usually combined with an elitist strategy such that

one or a number of the best individuals can be copied

into the new generation.
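Generation replacement with elitism could look like the sketch below. It assumes that larger fitness is better and that the offspring dropped to make room for the elite are the weakest ones; the text does not specify which offspring are displaced, so that choice is an assumption, as is the helper name `replace_with_elitism`.

```python
def replace_with_elitism(old_population, offspring, fitness, n_elite=1):
    """Build the next generation from the offspring, keeping the n_elite best old individuals."""
    ranked_old = sorted(old_population, key=fitness, reverse=True)
    ranked_new = sorted(offspring, key=fitness, reverse=True)
    elite = [list(ind) for ind in ranked_old[:n_elite]]  # copied unchanged
    keep = len(old_population) - len(elite)              # population size stays constant
    return elite + [list(ind) for ind in ranked_new[:keep]]
```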

The cycle of evolution is repeated until a desired

termination criterion is reached. This criterion can be

set by the number of evolution cycles (computational

runs) or a predefined value of objective function [11].

APPLICATION OF GA TO THE TEST GENERATION PROBLEM

The Form of the GA Individual

In much early GA work the individuals were binary-coded, but more generally individuals may be binary-, integer-, or decimal-coded, and may also take matrix form. In a test preparation problem, it is possible to use binary-coded individuals. However, in this case the individuals would be longer, and conversion between binary and integer individuals would take considerable computational time. Therefore, in this study, the question numbers to be selected are represented by an integer-coded individual:

$$x = \{\mathrm{Question}_1, \mathrm{Question}_2, \ldots, \mathrm{Question}_M\}$$

where $M$ is the predefined test size and $x \in D^M$ is the vector of decision variables. In Figure 4, the population with N individuals is shown.
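An integer-coded population of this form can be initialized as in the following sketch. It assumes that questions are numbered from 1 to the bank size and that initial individuals are drawn without repetition; the study does not describe its initialization in this detail, and `init_population` is a hypothetical name.

```python
import random

def init_population(bank_size, test_size, pop_size, rng=random):
    """Create pop_size individuals, each a list of test_size distinct question numbers."""
    question_numbers = range(1, bank_size + 1)
    # random.sample draws without replacement, so no individual starts with duplicates.
    return [rng.sample(question_numbers, test_size) for _ in range(pop_size)]

# Example with the sizes reported in the experiments: 240 questions,
# a 20-question test, and a population of 100 individuals.
population = init_population(bank_size=240, test_size=20, pop_size=100)
```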

The Form of the Objective Function

The penalty function method, in which the objective

function value is degraded by some function of

constraint violation, has been the most popular for

constrained optimization by GA [12]. A penalty

function defines the fitness value of an infeasible

individual [13].

In the present study, the idea is to apply the following set of criteria to the selection process:

* Any feasible solution is preferred to any infeasible solution.

* Between two feasible solutions, either one may be preferred.

* Between two infeasible solutions, the one having the smaller constraint violation is preferred.

Figure 2 Crossover for binary individuals.

Figure 3 Mutation for binary individuals.

Figure 4 Integer coded population.

Based on these criteria, the objective function of the constrained optimization is

$$F(x) = f(x) + \sum_{1}^{m} \mathrm{viol}(x) \qquad (1)$$

where f(x) is the objective function value and viol(x) is the summation of all the violated constraints, such that viol(x) = 0 if x is feasible and viol(x) > 0 otherwise.

The constraints are:

* Questions frequently selected in previous tests are infeasible. A previously unselected question has a violation of 0 and is feasible; a question selected once has a violation of 1, and so on.

* The difference between the difficulty of a selected question and the requested difficulty level must be zero or minimal.
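The two constraints can be turned into a penalty value as in the sketch below. Several points are assumptions rather than details given in the text: the bank is stored as a dictionary keyed by question number, the field names `history` and `difficulty` are hypothetical, the two violations are weighted equally, and the unspecified term f(x) is left out, so the value computed is only the violation sum of Equation (1). Lower values are better.

```python
def question_violation(question, requested_difficulty):
    """Violation of one question: past appearances plus difficulty mismatch."""
    history_violation = question["history"]  # 0 if never selected before
    difficulty_violation = abs(question["difficulty"] - requested_difficulty)
    return history_violation + difficulty_violation

def total_violation(individual, bank, requested_difficulty):
    """Sum the violations of every question in an integer-coded individual."""
    return sum(question_violation(bank[q], requested_difficulty)
               for q in individual)

# Tiny illustrative bank: question 1 is feasible for difficulty 1,
# question 2 was used twice before and is two difficulty levels away.
bank = {
    1: {"difficulty": 1, "history": 0},
    2: {"difficulty": 3, "history": 2},
}
print(total_violation([1, 2], bank, requested_difficulty=1))  # prints 4
```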

Crossover and Mutation

During the crossover step of the algorithm, segments are cut and spliced between individuals. In some implementations of the GA this procedure presents no particular problems. For example, if individuals are binary-coded, and every conceivable binary individual is valid, crossover can be allowed to proceed without interference. However, in test generation problems, since integer-coded individuals are used, crossover frequently generates illegal offspring. For example, if two parents are crossed over as shown in Figure 5, the offspring produced are clearly illegal, since each of them has duplicated question numbers. It might seem reasonable to solve this difficulty by assigning a fitness of zero to all illegal individuals, to prevent the algorithm from reproducing them. However, this is an inefficient way to proceed. As the length of the individuals increases, the probability that crossover will generate illegal individuals grows very rapidly, to the point at which, for very long individuals, crossover almost always generates illegal progeny. Much of the calculation time is then spent assessing worthless individuals.

It is far better to attempt to repair the individuals, making them acceptable by removing duplicates. In this study, a mutation operation is proposed that prevents duplications in offspring and also directs the search randomly to new regions of the search space.

The flowchart of the proposed method is shown in Figure 6. As noted above, the mutation operation should expand the search space to regions that may not be close to the current population, in order to ensure a global search. If one of the duplicated parameters in an illegal offspring is replaced with a randomly selected one that is not already a member of that offspring, the duplication problem is solved and the search space is also expanded to new regions. Such an operation acts both as a part of the crossover operation and as the mutation operation itself.
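The repair idea can be sketched in Python as follows. This is not a transcription of the flowchart in Figure 6. It is a minimal reading of the description above, assuming questions are numbered from 1 to the bank size and that the later occurrence of each duplicate is the one replaced. The name `repair_duplicates` is hypothetical.

```python
import random

def repair_duplicates(individual, bank_size, rng=random):
    """Replace repeated question numbers with random ones not already in the individual."""
    if bank_size < len(individual):
        raise ValueError("the bank is too small to fill the test without repeats")
    # Candidates are questions that do not appear anywhere in the offspring.
    unused = set(range(1, bank_size + 1)) - set(individual)
    seen = set()
    repaired = []
    for question in individual:
        if question in seen:
            # Duplicate found: mutate it to a question new to this individual.
            question = rng.choice(sorted(unused))
            unused.remove(question)
        seen.add(question)
        repaired.append(question)
    return repaired

# An illegal offspring with questions 5 and 9 duplicated.
child = [5, 9, 5, 3, 9]
fixed = repair_duplicates(child, bank_size=10)
```

Since every replacement is drawn from outside the offspring, the repair both removes the duplication and introduces new question numbers, which is the mutation effect the method relies on.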

Figure 5 Crossover for integer individuals.

EXPERIMENTS AND ANALYSIS

A question bank database containing 240 questions was first created for generating tests with GA. Each question is characterized by scalar attributes such as the question number, the difficulty of the question, the chapter and sub-title number that the question belongs to, and the history of appearance in previously generated tests. A simplified version of the question bank database is shown in Table 1. The difficulty levels of the questions must be predefined in the question bank. In practice, deciding which difficulty level a question belongs to is not an easy task. The difficulty level of a question is determined once by the instructor when the question is entered into the question bank database. After that, it is updated according to statistical data about students' performance on each question. The history-of-appearance field in the question bank database is modified when a new test is generated: it is incremented by 1 for every question in the newly generated test. This gives these questions less chance than unselected questions of being chosen the next time a test is generated.

Since they are convenient for representing question numbers, integer-coded individuals were used in GA. A roulette-wheel selection mechanism was used to select the fittest individuals for the next generation. The best individual was copied directly into the next generation by using elitism. The effects of population size, crossover method, and maximum generation number were investigated; one-point crossover, a population size of 100, and a maximum generation number of 1,000 were adopted.

Table 1 Question Bank Database

Figure 6 Flowchart of duplication prevention with mutation.

By using GA, 25 tests were generated, each one different from the others. While generating the tests, five test sizes and five difficulty levels were requested. The results of the generated tests are given in Table 2. The table shows the requested test sizes and difficulties, and the number of feasible questions in the question bank for the requested values. In addition, it gives the average difficulty of each generated test, the number of feasible questions in each generated test, the number of violated questions that appeared in previous tests, and the number of violated questions whose difficulty differs from the requested difficulty. Finally, the table shows the success of feasible question selection.

When the requested test size is less than the number of feasible questions, GA is able to select feasible questions for the whole test easily. If it is greater than the number of feasible questions, GA first selects all the feasible ones and then the least violated ones to complete the test size. For example, when 20 questions and a difficulty level of 1 are requested, GA selects questions all of which are feasible: they have a difficulty of 1 and have not appeared in previous tests. When 40 questions and a difficulty level of 1 are requested, GA selects 24 feasible questions, 7 questions that appeared previously, and 9 questions with a different difficulty. In this run, GA first selects all of the feasible questions and then selects the least violated questions to complete the 40 questions. This run has 100% success, since GA is able to select the most appropriate questions.
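The text does not give an explicit formula for the success percentage. The sketch below is an inference that reproduces the values reported in Table 2: the number of feasible questions selected is divided by the largest number that could have been selected, which is the smaller of the test size and the size of the feasible subset. The function name `selection_success` is hypothetical.

```python
def selection_success(test_size, feasible_in_bank, feasible_selected):
    """Percentage of the attainable feasible questions that the GA selected."""
    attainable = min(test_size, feasible_in_bank)
    return 100.0 * feasible_selected / attainable

# 40 questions requested, 24 feasible in the bank, all 24 selected.
print(selection_success(40, 24, 24))            # prints 100.0
# 70 questions requested, 72 feasible in the bank, 68 selected.
print(round(selection_success(70, 72, 68), 1))  # prints 97.1
```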

The generation numbers at which the solutions are obtained by GA, shown in Figure 7, support the comment given above. For all test sizes, GA finds a solution in fewer generations when the requested test size is less than the number of feasible questions (see difficulties of 2, 3, and 4 for all test sizes). However, as the requested test size grows, the generation number of the solution increases, since the individual size used in GA is extended.

Table 2 Results of Generated Tests With Different Test Size and Difficulty Request

(The columns Average difficulty, Feasible questions, History violations, and Difficulty violations are results of the generated test.)

Test   Requested    Size of feasible   Average      Feasible    History      Difficulty   Success
size   difficulty   subset             difficulty   questions   violations   violations   (%)
20     1            24                 1            20          0            0            100
20     2            72                 2            20          0            0            100
20     3            120                3            20          0            0            100
20     4            72                 4            20          0            0            100
20     5            24                 5            20          0            0            100
24     1            24                 1            24          0            0            100
24     2            72                 2            24          0            0            100
24     3            120                3            24          0            0            100
24     4            72                 4            24          0            0            100
24     5            24                 5            24          0            0            100
40     1            24                 1.22         24          7            9            100
40     2            72                 2            40          0            0            100
40     3            120                3            40          0            0            100
40     4            72                 4            40          0            0            100
40     5            24                 4.83         24          9            7            100
50     1            24                 1.26         24          13           13           100
50     2            72                 2            50          0            0            100
50     3            120                3            50          0            0            100
50     4            72                 4            50          0            0            100
50     5            24                 4.76         24          14           12           100
70     1            24                 1.31         24          24           22           100
70     2            72                 2            68          2            0            97.1
70     3            120                3            70          0            0            100
70     4            72                 4            68          2            0            97.1
70     5            24                 4.68         24          24           22           100

CONCLUSION

In this study, a GA was used to optimize predefined criteria for selecting questions from a question bank. The crossover and mutation operators of the standard GA cannot be used directly for test generation, since integer-coded individuals have to be used and these operators produce duplicated genes within individuals. To solve this problem, a mutation operation was proposed that prevents duplications in individuals produced by crossover and also directs the search randomly to new regions of the search space.

A database containing classified test questions was created together with predefined attributes for selecting questions. By using GA, 25 tests were generated. In these runs, five test sizes and five difficulty levels were requested. The average difficulties, the number of feasible questions, the number of violated questions, and the success of each run were observed. The experiments and analysis show that the GA with the proposed mutation operator achieves a success rate of approximately 100% in feasible question selection.

Another result of the study is that, when the requested test size is less than the number of feasible questions, GA is able to select feasible questions for the whole test easily. If it is greater than the number of feasible questions, GA first selects all the feasible ones and then the least violated ones to complete the test size.

One more result is that, for all test sizes, GA finds a solution in fewer generations when the requested test size is less than the number of feasible questions. However, as the requested test size grows, the generation number of the solution increases, since the individual size used in GA is extended.

Figure 7 Generation numbers of solutions with different test sizes and difficulties.

REFERENCES

[1] M. Thelwall, Computer-based assessment: a versatile educational tool, Comput Educ 34 (2000), 37–49.

[2] T. Fei, W. J. Hag, K. C. Toh, and T. Qi, Question classification for e-learning by artificial neural network. In: 4th International Conference on Information, Communications and Signal Processing-PCM, Singapore (2003), 1757–1761.

[3] R. W. Brown, Multi-choice versus descriptive examinations. In: 31st ASEE/IEEE Frontiers in Education Conference, Reno-NV (2001) T3A13-18.

[4] J. Protić, D. Bojić, and I. Tartalja, Test: tools for evaluation of students' tests - A development experience. In: 31st ASEE/IEEE Frontiers in Education Conference, Reno-NV (2001) F3A6-12.

[5] D. Prabhu, B. P. Buckles, and F. E. Petry, Genetic algorithms for scene interpretation from prototypical semantic description, Int J Intel Syst 15 (2000), 901–918.

[6] K. F. Man, K. S. Tang, and S. Kwong, Genetic algorithms: concepts and applications, IEEE Trans Ind Electron 43 (1996), 519–533.

[7] F. Herrera, M. Lozano, and A. M. Sanchez, A taxonomy for the crossover operator for real-coded genetic algorithms: An experimental study, Int J Intel Syst 18 (2003), 309–338.

[8] W. M. Spears and V. Anand, A study of crossover operators in genetic programming. In: Proceedings of the 6th International Symposium on Methodologies for Intelligent Systems, Springer-Verlag, 1991, pp 409–418.

[9] K. Kristinsson and G. A. Dumont, System identification and control using genetic algorithms, IEEE T Syst Man Cybern 22 (1992), 1033–1046.

[10] M. Yildirim, K. Erkan, and S. Ozturk, Power generation expansion planning with adaptive simulated annealing genetic algorithm, Int J Energ Res 30 (2007), 1188–1199.

[11] F. Zhu and S. U. Guan, Ordered incremental training with genetic algorithms, Int J Intel Syst 19 (2004), 1239–1256.

[12] N. Kumar and K. Shanker, A genetic algorithm for FMS part type selection and machine loading, Int J Prod Res 38 (2000), 3861–3887.

[13] A. C. C. Lemonge and H. J. C. Barbosa, An adaptive penalty scheme for genetic algorithms in structural optimisation, Int J Numer Meth Eng 59 (2004), 703–736.

BIOGRAPHY

Mehmet Yildirim received the bachelor's degree in electronics and computer education from Marmara University, Istanbul, in 1995 and the PhD degree from Kocaeli University in 2003. He is currently an assistant professor in the department of electronics and computer education, Kocaeli University, Turkey. His research interests include electrical energy generation planning, dynamic system modeling, computer networks, genetic algorithms and simulated annealing.
