
2010 10th IEEE International Conference on Advanced Learning Technologies (ICALT), Sousse, Tunisia, July 5-7, 2010



An Adaptive Method for Selecting Question Pools using C4.5

Ahmad Mustafa Seet
Computer Science and Engineering
American University of Sharjah
Sharjah, UAE
[email protected]

Imran A. Zualkernan
Computer Science and Engineering
American University of Sharjah
Sharjah, UAE
[email protected]

Abstract— A number of adaptive mechanisms for asking questions have been proposed. This paper presents a novel adaptive method for selecting questions from various pools of questions. The method uses a state diagram to represent the question pools; transitions between the states are dynamically modified to select the optimal pool from which to draw the next question. C4.5 is used to close the loop on the adaptive selection of questions. A case study showing how the method was implemented for an online environmental game is presented and evaluated.

Keywords— adaptive testing; C4.5; selecting questions

I. INTRODUCTION

Asking questions is a key component of both formative and summative assessment. Recent studies have shown online formative assessment to be of value [1][2]. Most learning management systems (LMS) also support some form of an assessment engine. In addition, XML standards like QTI [3] have evolved to deliver and track online assessments. Adaptive testing seeks to personalize, or adapt, the questions being asked to an individual learner's capabilities. It is important to distinguish adaptation of questions within learners from adaptation across learners [4]. Adaptation within learners relies on data from an individual learner, while adaptation across learners relies on data from a population of learners. For example, Item Response Theory (IRT) is a well-known psychometric technique developed to select an optimal set of questions for an individual learner based on data collected from a large population of learners [5]. Recent extensions to this method using a Bayesian approach [6] and particle swarm optimization techniques [7] have also been proposed. Adaptive Conjoint Estimation (ACE) is an example of a technique for adaptation within learners [4]. Knowledge space theory is yet another approach; it uses prior relationships between questions to adapt the number of questions asked of a particular learner based on their prior answers [8]. This approach has also recently been applied to model games and simulation-based learning [9]. Algorithms for question selection have also been investigated in the context of learning through demonstration [10]. Finally, classical optimization methods like SAT algorithms have been used to adapt question-asking strategies [11].

This paper presents a novel adaptive method that guides each learner through a pool of questions based on their answers to prior questions. Unlike across-learner adaptation methods such as IRT, the method does not depend on the availability of a large pool of questions or on known relationships among them, and it makes minimal assumptions about the relationships between pools. The rest of the paper is organized as follows: the method is presented first, followed by a case study and an evaluation.

II. AN ADAPTIVE METHOD

A. Problem Formulation

In both formative and summative assessment, sets or pools of questions are typically formulated to assess specific learning outcomes. Consider each set or pool of questions to consist of m questions; that is, S = {q_1, q_2, …, q_m}. Depending on a learner's performance on a particular set of questions, they may be promoted to a different pool. For example, one may start with a set of questions on fundamental concepts and, if the learner answers them correctly, move on to more difficult questions related to an application of the concepts. However, if a learner is unable to answer questions in the more difficult set, they may be moved back to the easier set. Transitions between sets of questions can be modeled using a state diagram such as the one in Fig. 1. As Fig. 1 shows, the assessment contains three pools or sets of questions, where each set is represented by a state. Transitions between states represent the conditions under which a learner is allowed to move from one set of questions to another. The first type of transition, from a state S_i to a more advanced state S_k, is governed by an initial transition ratio P_ik, an acceptable threshold for moving from S_i to S_k; this threshold value represents the minimum competency required to move on to a new set of questions. For example, P_ik = 0.7 means that a learner who has answered 70% of the questions from set S_i correctly should be moved to set S_k. The second type of transition is the backward transition: instead of advancing, the learner retreats from S_k to a previous state S_i, governed by P_ki, the threshold for moving back from S_k to S_i. The third and final type of transition is from S_i to itself; it represents the threshold under which the learner keeps getting questions from the same set. The state diagram

2010 10th IEEE International Conference on Advanced Learning Technologies

978-0-7695-4055-9/10 $26.00 © 2010 IEEE

DOI 10.1109/ICALT.2010.31



also shows an initial state indicating the first set of questions to be asked.

Figure 1. State-diagram showing the question sets and transitions
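The forward, backward, and self transitions described above can be sketched as a simple decision rule. This is an illustrative sketch, not the paper's code; the class and method names are our own, and it assumes a single forward threshold P_ik and backward threshold P_ki for the current state.

```java
// Illustrative sketch of the three transition types for one state S_i.
public class Transition {
    public enum Move { RETREAT, STAY, ADVANCE }

    /**
     * Decide the next move given the learner's current ratio of correctly
     * answered questions in the current set.
     */
    public static Move next(double ratio, double pForward, double pBackward) {
        if (ratio >= pForward) return Move.ADVANCE;   // met P_ik: go to the harder set
        if (ratio <= pBackward) return Move.RETREAT;  // fell to P_ki: back to the easier set
        return Move.STAY;                             // keep drawing from the same set
    }
}
```

With P_ik = 0.7 and P_ki = 0.3, a ratio of 0.8 advances, 0.5 stays, and 0.2 retreats.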

B. The Adaptive Method

The overall state of a learner at any point in time is represented by a vector of transition ratios for each set. For example, for the state diagram shown in Fig. 1, the overall state is represented by the vector {P_13, P_21, P_12, P_23, P_32}. In general, the problem of which question to ask next can be formulated as follows: given a vector of current transition ratios over the n transitions, find the best next state to be in. The basic idea behind the adaptive method is to generate synthetic data from the state diagram in a systematic manner and to use a machine learning method like C4.5 [12] to generate a decision procedure that determines the next best state. C4.5 is a statistical classifier that uses information entropy to create a decision tree given training examples T_1, T_2, …, T_n [12]. Each training example T_i consists of a vector of attributes x_1, x_2, …, x_m followed by a class label c_i ∈ C, where C is the set of classes c_1, c_2, …, c_k used for classification. In this formulation, each transition ratio P_i represents an attribute while each question set represents a class c_j. Therefore, each training vector consists of the values of the current transition ratios followed by the next state. As Fig. 2 shows, this sets up a closed-loop system in which the user is asked a question and, depending on the answer, the transition ratio vector is updated. Based on this vector, C4.5 is used to generate an optimal decision tree, which in turn is used to select the next question to ask.
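To make the training-vector formulation concrete, the sketch below enumerates ratio vectors on a coarse grid and labels each one with a next-state class. The grid step and the "first forward threshold met wins" labeling rule are our assumptions; the paper does not give its generation algorithm in this detail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hedged sketch of the synthetic training table: each row holds the
// transition ratios (attributes) and ends with the next-set class label.
public class TrainingRow {
    /** Label a ratio vector: index of the first forward transition whose
     *  threshold is met, or -1 when the learner should stay put. */
    public static int label(double[] ratios, double[] thresholds) {
        for (int i = 0; i < ratios.length; i++) {
            if (ratios[i] >= thresholds[i]) return i;
        }
        return -1;
    }

    /** Enumerate all ratio vectors on a coarse grid and emit labeled
     *  CSV-like rows that a C4.5 implementation (e.g. WEKA's J48) can consume. */
    public static List<String> rows(double[] thresholds, double step) {
        List<String> out = new ArrayList<>();
        int pointsPerDim = (int) Math.round(1.0 / step) + 1;
        emit(out, new double[thresholds.length], 0, pointsPerDim, step, thresholds);
        return out;
    }

    private static void emit(List<String> out, double[] v, int dim, int points,
                             double step, double[] thresholds) {
        if (dim == v.length) {
            StringBuilder sb = new StringBuilder();
            for (double x : v) sb.append(String.format(Locale.ROOT, "%.1f,", x));
            sb.append(label(v, thresholds));
            out.add(sb.toString());
            return;
        }
        for (int i = 0; i < points; i++) {
            v[dim] = i * step;                      // next grid value for this ratio
            emit(out, v, dim + 1, points, step, thresholds);
        }
    }
}
```

For two transitions with a step of 0.5 this produces a 3×3 grid of nine labeled rows.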

An algorithm has been developed to generate the training data for C4.5; the data consist of a table of training vectors. Each column in this table represents a transition ratio P_i, and the final column contains the question set for the next question. This algorithm generates data for a static decision tree based on the current threshold ratios. However, the process is augmented by updating the transition threshold ratios based on the user's answers. The basic intuition is that if a user answers a question from a question set correctly, all the outgoing transition thresholds from that set are decremented by a small step ∆, making it "easier" to get out of that state. Similarly, if the user answers a question incorrectly, the outgoing transition thresholds from the corresponding set are incremented, making it harder to get out of the state. In this sense, the threshold ratios change after each question, resulting in a different decision tree after every question and thus in the feedback loop shown in Fig. 2.
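The per-answer ∆ update can be sketched as follows. Clamping the thresholds to [0, 1] is our assumption; the paper does not say how boundary values are handled.

```java
// Sketch of the per-answer threshold update. A correct answer lowers all
// outgoing thresholds by ∆ (easier to leave the state); an incorrect one
// raises them. Clamping to [0, 1] is an assumption, not from the paper.
public class ThresholdUpdate {
    public static double[] update(double[] outgoingThresholds, boolean correct, double delta) {
        double step = correct ? -delta : delta;
        double[] out = outgoingThresholds.clone();
        for (int i = 0; i < out.length; i++) {
            out[i] = Math.max(0.0, Math.min(1.0, out[i] + step));
        }
        return out;
    }
}
```

For example, a correct answer with ∆ = 0.1 turns outgoing thresholds {0.7, 0.5} into {0.6, 0.4}.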

III. CASE STUDY

A. Design

A simple "game" consisting of various questions regarding air quality, taken from the Air Quality Index (AQI) for kids section of the AIRNOW website [13], was used to conduct an initial evaluation of the proposed method. The game, called the "AQI color game," is designed to be played by children aged 10 to 16. The AQI game was re-implemented in Java. Questions from the original game were divided into five sets depending on their level of difficulty. Table I shows sample questions from each set. Fig. 3 shows the state diagram for the five question sets.

TABLE I. SAMPLE QUESTIONS FOR VARIOUS QUESTION SETS

Set  Sample Question
1    Which color is unhealthy on the AQI?
2    The AQI is 299 today. What AQI color does that correspond to?
3    What is/are the health word(s) for a red day?
4    The AQI color is orange; that means the air is unhealthy for what groups?
5    When can dirty air be most dangerous?

Figure 3. State diagram showing relations between various sets of questions.

The data generation algorithm was used to generate data from the state transition diagram. The C4.5 Java library (J48) provided in WEKA [14] was then used to generate the successive decision trees. Fig. 4 shows a decision tree generated at the beginning of one user session. Based on the current values of the various transition ratios, the tree leads to the selection of one of the five question sets (the leaves) to choose from.

B. Evaluation

The game thus created resulted in a smooth user experience in which the questions being asked depended on how well the learner knew the questions in a specific set. However, in order to perform a quantitative evaluation, a Monte Carlo simulation was conducted. Different types of learners were simulated by varying the probability with which they could answer any of

[Figure 2 shows a flowchart whose nodes include: User, Read user input, Construct new table, Generate data, C4.5, Construct new tree, Get next question, Display question to user.]

Figure 2. A closed loop system, acquiring the next question from the tree



the questions correctly. For example, a simulated learner with probability p = 0.5 answered questions (in any set) correctly half of the time (50%). One thousand trials were conducted for each of several probabilities representing different types of learners, and the number of questions asked before the game ended was recorded. Fig. 5 shows the mean number of questions asked for each probability.
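The simulation setup can be sketched roughly as below. The five-set chain, the fixed forward threshold of 0.7, the per-set reset of counters, and the omission of backward transitions are all simplifying assumptions made here for brevity; the paper's actual simulation runs the full adaptive loop.

```java
import java.util.Random;

// Hedged sketch of the Monte Carlo setup: a simulated learner answers any
// question correctly with probability p; the game ends once all sets are cleared.
public class MonteCarlo {
    /** Questions a simulated learner needs before clearing all five sets. */
    public static int questionsUntilDone(double p, long seed) {
        Random rng = new Random(seed);
        final int sets = 5;
        final double advance = 0.7;              // assumed fixed forward threshold
        int asked = 0, set = 0, correct = 0, attempted = 0;
        while (set < sets && asked < 10_000) {   // cap guards very weak learners
            asked++;
            attempted++;
            if (rng.nextDouble() < p) correct++;
            if ((double) correct / attempted >= advance) {
                set++;                           // advance to the next, harder set
                correct = 0;
                attempted = 0;
            }
        }
        return asked;
    }

    /** Mean number of questions over a number of independent trials. */
    public static double meanQuestions(double p, int trials, long seed) {
        long total = 0;
        for (int t = 0; t < trials; t++) total += questionsUntilDone(p, seed + t);
        return (double) total / trials;
    }
}
```

A perfect learner (p = 1.0) clears each set with one question, so five questions end the game; lower probabilities lengthen the run.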

Figure 4. The initial decision tree (no questions answered)

Figure 5. Results of Monte Carlo simulation (n=1000 for each point)

Another interesting question is how to quantify the way the decision trees that select the question set change as questions are answered. The Levenshtein distance (LD) [15] is used to measure the distance between two trees (before and after asking a particular question). Fig. 6 shows the LD for a specific type of learner (p = 0.7). As Fig. 6 shows, the structure of the decision tree changes little initially, but as the learner answers additional questions, the structural changes in the decision tree become more drastic. For example, the trees before and after the 10th question are significantly different (LD > 97).
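The tree comparison uses the standard Levenshtein distance; a textbook dynamic-programming implementation over the trees' serialized text is sketched below. Comparing character by character is one plausible reading of how the tree dumps were compared, not a detail the paper specifies.

```java
// Standard Levenshtein (edit) distance between two strings, e.g. the
// serialized decision trees before and after a question is asked.
public class TreeDistance {
    public static int levenshtein(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;   // deletions
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;   // insertions
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1,   // delete
                                            d[i][j - 1] + 1),  // insert
                                   d[i - 1][j - 1] + cost);    // substitute
            }
        }
        return d[a.length()][b.length()];
    }
}
```

For example, levenshtein("kitten", "sitting") is 3: two substitutions and one insertion.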

IV. CONCLUSION

This paper has proposed a novel method for selecting pools of questions while making minimal assumptions about the relationships between questions or the availability of a large pool of questions. The method was implemented and shown to work well in a limited question-based "game." We are currently generalizing the algorithm to generate QTI assessments based on an initial state diagram of question pools. Similarly, we are exploring a sigmoid function rather than a linear increment and decrement for ∆.
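One way the sigmoid-shaped step could look is sketched below. The particular logistic shape, its slope, and its centering at 0.5 are purely our assumptions; the paper only names this as a direction for future work.

```java
// Hypothetical sigmoid-shaped step: largest near a threshold of 0.5 and
// shrinking toward the [0, 1] boundaries, unlike the constant linear ∆.
public class SigmoidStep {
    public static double step(double threshold, double deltaMax) {
        double s = 1.0 / (1.0 + Math.exp(-10.0 * (threshold - 0.5)));
        return 4.0 * deltaMax * s * (1.0 - s);   // peaks at deltaMax when threshold = 0.5
    }
}
```

The sign of the step (decrement on a correct answer, increment on an incorrect one) would still be applied as in the linear scheme.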

Figure 6. Levenshtein distance between decision trees

ACKNOWLEDGMENT

The research presented here was supported in part by a faculty research grant from the IBM Corporation.

REFERENCES

[1] T. Wang, “Web-based dynamic assessment: Taking assessment as teaching and learning strategy for improving students’ e-Learning effectiveness,” Computers & Education, 2010 (in press).

[2] D. S. J. Costa, B. A. Mullan, E. J. Kothe, and P. Butow, "A web-based formative assessment tool for Masters students: A pilot study," Computers & Education, 2009 (in press).

[3] IMS Question and Test Interoperability Specification, available at http://www.imsglobal.org/question/, Accessed January 25, 2010.

[4] O. Toubia, D. I. Simester, J. R. Hauser, and E. Dahan, "Fast Polyhedral Adaptive Conjoint Estimation," Marketing Science, vol. 22, no. 3, Summer 2003, pp. 273-303.

[5] P.D. Boeck and M. Wilson (eds), Explanatory Item Response Models, Springer-Verlag, New York, 2004.

[6] O. Vozar and M. Bielekova, “Adaptive test question selection for web-based educational systems,” Proceedings of the 2008 Third International Workshop on Semantic Media Adaptation and Personalization, pp. 164-169, 2008.

[7] Y. Huang,Y. Lin and S. Cheng, “An adaptive testing system for supporting versatile educational assessment,” Computers & Education vol. 52, pp. 53–67, 2009.

[8] J.-P. Doignon and J.-C. Falmagne, "Spaces for the assessment of knowledge," International Journal of Man-Machine Studies, vol. 23, pp. 175–196, 1985.

[9] M. D. Kickmeier-Rust, C. Hockemeyer, D. Albert, and T. Augustin, "Micro adaptive, non-invasive assessment in educational games," in M. Eisenberg, Kinshuk, M. Chang, and R. McGreal (Eds.), Proceedings of the Second IEEE International Conference on Digital Game and Intelligent Toy Enhanced Learning, pp. 135-137, November 17-19, 2008, Banff, Canada.

[10] M. Gervasio, K. Myers, M. desJardins, and F. Yaman, "Question Asking to Inform Preference Learning: A Case Study," Proceedings of the AAAI 2009 Fall Symposium on Agents that Learn from Human Teachers, November 5–7, 2009, The Westin Arlington Gateway, Arlington, Virginia.

[11] J. Straach and K. Truemper, "Learning to ask relevant questions," Artificial Intelligence, vol. 111, pp. 301–327, 1999.

[12] J. Ross Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers, Inc., 1993.

[13] AirNow, available at http://www.airnow.gov/, Accessed January 25, 2010.

[14] Weka 3 – Data Mining Software in Java, available at http://www.cs.waikato.ac.nz/ml/weka/, Accessed January 25, 2010.

[15] V. Levenshtein, "Binary codes capable of correcting deletions, insertions, and reversals," Soviet Physics Doklady, vol. 10, pp. 707–710, 1966.

[Figure 5 plots "Average number of questions" against "Correctness Probability (in %)" with Mean and 95% CI curves, titled "The average distribution over 1000 runs for ∆ = 0.1". Figure 6 plots "Levenshtein distance" against "Number of questions asked".]