
e-Proceeding of the Global Conference on Technology in Language Learning 2015 (e-ISBN 978-967-0792-03-3). Meliá Hotel Kuala Lumpur, Malaysia. Organized by http://worldconferences.net/home/

1ST GLOBAL CONFERENCE ON TECHNOLOGY IN LANGUAGE LEARNING 2015 (GLIT2015), JUNE 2015

CONDUCTING AND EXAMINING EFFECTIVENESS OF ONLINE ASSESSMENTS USING THE MODULAR OBJECT-ORIENTED

DYNAMIC LEARNING ENVIRONMENT (MOODLE)

Rosyati Abdul Rashid & Wan Nurhafeza Wan Salam

Department of Languages and Communication

Center for Foundation and Liberal Education

Universiti Malaysia Terengganu (UMT)
[email protected]

[email protected]

ABSTRACT

This paper aims to demonstrate the robustness of Moodle, a Learning Management System (LMS), in assisting language instructors to administer tests and to analyze the quality of test items and of the test as a whole. Using the Moodle Quiz Module, a public speaking test comprising 40 multiple-choice questions (MCQs) was created and administered online to 131 students. Descriptive statistics obtained through the statistical analysis procedures performed by Moodle indicated that the test was reliable and that it was difficult for many students, since many of the test items were at the higher levels of Bloom's taxonomy of educational objectives. For the 58 students who also took the Cornell Critical Thinking Test (CCTT), Level X, CCTT scores were found to be significantly correlated, at a moderate level, with their public speaking test scores. Data generated by Moodle, such as the analysis of students' responses to each test item and the item statistics comprising the facility index (FI), discrimination index (DI), and distractor efficiency (DE), were found to be very useful in helping the test developers to examine the effectiveness of the test items and to improve them, so that only good items are kept in the test bank for future use in assessing different groups of students.

Field of Research: Moodle, critical thinking, multiple-choice questions, item analysis, FI, DI, and DE indices

----------------------------------------------------------------------------------------------------------------------------------

1. Introduction

To remain competitive in the twenty-first century, Malaysia has to produce a knowledgeable and skilled workforce who can think critically and function well in a variety of work conditions. Realizing the urgent need to develop such citizens, the Malaysian government recently launched the Malaysia Education Blueprint 2013-2025, which lists eleven shifts needed to transform the Malaysian education system. One of the shifts is to "provide equal access to quality education of an international standard". Among the actions listed to realize this shift is the revamping of examination questions, so that a greater percentage of questions at higher-order thinking levels is included in school and all national examinations by the year 2016. This newly revised education policy initiative is hoped to assist the nation in generating a high-quality workforce that will help steer the country towards greater prosperity. To skeptics, however, it seems questionable whether such reform of school and national assessments can be implemented successfully within the target time, considering the rather short time frame given to actualize the reform and teachers' current, somewhat unsatisfactory level of assessment literacy.

2. Literature Review

2.1 Educational reforms in Malaysia and teachers’ assessment literacy

Many reform efforts have been undertaken by the Malaysian government to improve the national education system (Rajendran, 2001). The first reform of educational policy was based on the Razak Report, prepared before the country gained its independence in 1957; the most recent is set out in the Malaysia Education Blueprint 2013-2025. One outcome of these reform efforts was the inclusion of critical thinking, or higher-order thinking skills, in the objectives of the secondary school curriculum (Curriculum Development Center, 1989, p. 2) and in the Integrated Curriculum for Secondary Schools (ICSS). The National Higher Education Action Plan likewise stressed that the focus of the education system was "on generating world-class and holistic human capital who are intellectually active, creative and innovative, ethically and morally upright, adaptable and capable of critical thinking" (MOHE, 2007, p. 8), and the latest educational blueprint states that "the national curriculum aims to create Malaysian students that are balanced, resilient, inquisitive, principled, informed, caring, patriotic, as well as effective thinker, communicator, and team player" (Malaysia Education Blueprint 2013-2025, Exhibit 4-2).

Although producing students with critical thinking ability has been an educational objective for quite some time, the question of whether the goal is achievable has been raised frequently. For example, Indramalar in 1997 (as cited in Rajendran, 2001, p. 3) reported the Minister of Education saying that "the education system will be revamped to encourage rational and analytical thinking". The statement implies that the education system at that particular time had failed to realize the educational objective of producing critical thinkers. One of the measures then taken to address the issue was to revamp public examination papers: a policy announced in 1994 targeted that, by the year 2000, sixty percent of examination papers should comprise questions of the higher-order thinking type.

The issue appears to have recurred recently: an analysis by the Pearson Education Group of the UPSR and SPM English papers for the years 2010 and 2011 revealed that about seventy percent of the questions in those national examination papers assessed "basic skills of knowledge and comprehension", the two lowest levels of the cognitive domain in Bloom's taxonomy (Malaysia Education Blueprint 2013-2025). This finding partly led the Ministry of Education to include the revamping of examination questions as one of the main steps in realizing the first shift to be made to the education system in this 21st century.

What actually went wrong? Perhaps Malaysian teachers are not yet fully prepared to assess students, or they have not been given adequate training on assessment. According to Lim, Wun, and Chew (2014), Malaysian teachers do not have sufficient knowledge and skills to construct assessments that can effectively measure the intended learning outcomes. The authors drew this conclusion from the findings of several local studies cited in their article. For example, a recent study by Suah (2012) involving 3,866 primary and secondary school teachers presented evidence that the majority of the teachers had an unsatisfactory level of assessment literacy. The authors' conclusion was further supported by the work of Mohamad (2006) and Salbiah (1995), who revealed that, when developing and carrying out assessments, teachers emphasized the content of the syllabus more than the target learning outcomes. These pieces of evidence help to account for the low percentage of higher-order thinking questions in the national examination papers reported earlier in this paper.

Much has been said about teachers' low assessment literacy and the poor quality of examination papers, but little has been written about how to raise that literacy so that the quality of the papers can be enhanced. The new Education Blueprint explicitly states that changes will be made to both school assessments and national examination papers and that teachers will receive training and guidance from the National Examination Board. But experience and research have shown that training alone will not solve the problem we are encountering now. In fact, to date, the Malaysian Ministry of Education has supported teachers with various kinds of training and workshops, including those on assessment and the teaching of thinking skills (Rajendran, 2001; Lim, Wun, & Chew, 2014). A study by Rajendran (2001) showed that 60 percent of the teachers in his study who attended training did not perceive themselves to be better prepared to teach higher-order thinking skills than those who received no training. Their ability to assess, or to set examination questions testing, the same skills may likewise differ little. Perhaps what these teachers need, after receiving extensive training from the Ministry, is ample opportunity to practice creating and carrying out their own assessments, plus the chance to improve them.

2.2 Online assessment

A plausible solution to the issue addressed so far is to train and motivate teachers to conduct online assessment. This type of assessment requires teachers to assess their students' performance through an e-learning platform. The proposed solution seems practical, considering that most schools and universities are equipped with computer laboratories.

Online assessment allows teachers and other educators to systematically evaluate their teaching practice and test development. It has been shown to improve teaching and learning methods: through online assessment, teachers can continuously evaluate their teaching while concurrently gauging their students' academic performance.

2.2.1 Learning Management System (LMS): Moodle

A Learning Management System (LMS) is defined as "software that has been used in a learning content presentation which has a significant role and complexity in e-learning environment" (Aydin & Tirkes, 2010, p. 2). An LMS is an e-learning platform utilized to support efforts to improve educational quality and to offer a wider range of options. For assessment purposes, an LMS provides a range of testing and evaluation formats and thus offers flexibility in grading.

The Modular Object-Oriented Dynamic Learning Environment (Moodle) is one type of LMS popularly used today. Moodle, with its modular design, supports a variety of assessment types and allows test developers to manipulate test constraints such as time, date, and test duration (Aydin & Tirkes, 2010). It provides an advanced online quiz module in which questions can be created in different formats, such as multiple-choice, matching, and short-answer questions. In addition, educators can prepare their online assessments based on their specified learning objectives. Moodle is one of the top choices of learning institutions in terms of robustness. Among its best features are its ability to assist educators in improving pedagogical quality (Aydin & Tirkes, 2010), its ease of maintenance and usability (Wright & Wright, 2011), and the 'reliability and functionality' of its teaching and learning content, especially with regard to testing and assessment (Whelan & Bhartu, 2007, p. 1055; Costa, Alvelos & Teixeira, 2012).


3. Main Objective of the Study

The main objective of the present study was to demonstrate that online assessment can be carried out using Moodle, and that this LMS can assist test developers in examining the effectiveness of their test items while furnishing them with data that enable them to further improve the quality of their tests.

3.1 Specific Objective

Specifically, this study aimed at assessing students' critical thinking ability through a standardized test uploaded to the university e-learning website, and their performance in a public speaking test developed using the quiz module, one of the tools available in Moodle.

4. Research Questions

The following research questions were addressed in the study:

a. What is the reliability of the two tests used in the study?

b. Is students’ critical thinking related to their performance in the public speaking test?

c. How can item analysis statistics provided by Moodle help instructors improve the quality of the public speaking test?

i. How many items are considered easy and how many are categorized as difficult (i.e., what is the FI index for each item)?

ii. How good are the test items at discriminating good students from weak ones (i.e., what is the DI index for each item)?

iii. How good are the test item distractors (i.e., what is the DE index for the test items)?

5. Methodology

5.1 Sample

The original sample of the present study consisted of 131 students from four public speaking classes taught by the present teacher researchers. The final sample used to establish the relationship between the two main variables (i.e., performance in a public speaking test and critical thinking) was only fifty-eight (58), since not all of the students attempted the critical thinking test posted on the university e-learning site.

5.2 Instruments

Two tests were used in the study: the Cornell Critical Thinking Test (CCTT), Level X, and a public speaking test. The items for both tests were created using the Moodle Quiz Module and were later uploaded to the university e-learning site for students to access.

5.2.1 Cornell Critical Thinking Test, Level X

The Cornell Critical Thinking Test (CCTT), Level X, was used to measure the critical thinking ability of the students involved in the present study. This standardized test was developed by Ennis, Millman, and Tomko (1985, 2005) and later translated into Malay by Shaharom (2004). The test contains seventy-one multiple-choice items; one (1) mark is given for each correctly answered item, yielding a maximum score of seventy-one (71). A high total score on the test is taken to reflect high critical thinking ability on the part of the test taker.

5.2.2 Public Speaking Test

The public speaking test employed in this study had forty multiple-choice items. Each item had five options, from which a test taker had to choose the correct answer. The test was used to gauge what students had learned about the theories and guidelines in six chapters of The Art of Public Speaking (Lucas, 2010), the textbook adopted for the related course. The chapters provide students with information on how to effectively prepare speech outlines and texts before using them to rehearse their speeches. The main objective of the public speaking course is to enable students to prepare and deliver speeches effectively. The speech preparation process requires students to exercise their critical thinking skills so that their ability to discern relationships among ideas is clearly reflected in their outlines and texts.

In the past, the public speaking test at Universiti Malaysia Terengganu was administered by gathering eight hundred to one thousand students in a few examination halls where test papers were distributed to them. However, the establishment of an upgraded university e-learning site motivated language instructors to conduct the test online. The site adopted an LMS named the Modular Object-Oriented Dynamic Learning Environment (Moodle) as its teaching and learning platform. Moodle has made it possible for the researchers and their colleagues to construct and administer online public speaking tests through its Quiz Module for the past six semesters. The earlier versions of the online multiple-choice-question (MCQ) tests relied heavily on items from the test bank of the adopted textbook. However, experience gained from administering online tests with such items showed the present teacher researchers the need to discard some of the test bank items and construct new items suited to local contexts, to ensure that the course's educational goals could be met and that cheating among students could be prevented. In addition, the researchers needed a mechanism that would allow them to administer the test without calling on all language instructors to invigilate their respective groups of students in the computer laboratories. Observation of students' behaviour while taking the test and analyses of the test results gave the present researchers insight into the problem: students took longer to answer test bank items at the higher levels of Bloom's taxonomy of educational objectives. When such items were included in tests given later to a different group of students, without instructors present, cheating was much reduced. Since then, it has been considered practical to include in the test high-level items, which require students to apply, analyze, and evaluate information.

The frequent use of the Moodle quiz module to create and administer online tests led the teacher researchers to explore the potential of the LMS further. The exploration led to the discovery of a tool in the quiz module that helped the researchers determine the reliability of a test and the quality of its items. This tool applies a statistical technique called item analysis; the relevant descriptive statistics computed by this procedure are made available by Moodle in the statistics section of the quiz module. Following the discovery of such a useful tool, the researchers felt a strong urge to conduct the present research as an attempt to further improve the quality of the test given to the students. The researchers also wanted to determine the appropriateness of the decisions they had made throughout the previous test construction process, so that they would be better informed when constructing items for future tests.

5.2.3 Data analysis procedures

This quantitative study employed the item analysis procedures offered by the Moodle Quiz Module to determine the reliability of the two tests and to examine the effectiveness of the items in the public speaking test. The module allowed the researchers to export the collected data and convert them into other formats, such as Microsoft Excel and SPSS files, for further statistical analysis. This enabled the researchers to compute a correlation coefficient to determine the relationship between students' performance on the two tests used in the study.
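As a minimal illustration of this export-then-analyze workflow, the sketch below loads a hypothetical Moodle quiz export with pandas and computes total scores; the file name and column layout are assumptions for illustration only, since the actual columns of a Moodle responses export depend on the report settings and Moodle version.

```python
import pandas as pd

# Hypothetical export: one row per student, one 0/1 column per item (Q1..Q40).
# The file name and column naming are assumptions, not Moodle's fixed format.
responses = pd.read_csv("public_speaking_quiz_export.csv")
item_cols = [c for c in responses.columns if c.startswith("Q")]

# Total score per student, ready to be saved out for Excel or SPSS.
totals = responses[item_cols].sum(axis=1)
print(totals.describe())
```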

6. Results and Discussion

6.1 Cornell Critical Thinking Test (CCTT)

The analysis of the data revealed that the CCTT was reliable: the computed alpha value was .73, a value within the range of reliability estimates reported in the test manual (Ennis et al., 1985, 2005). The mean score indicated that, on average, the students answered 33 of the 71 test items correctly. This value is much lower than the average score obtained by the norm sample (consisting of American undergraduates) reported in the CCTT test manual. Although the computed mean did not reach the expected level, it still provides evidence that Malaysian undergraduates can think critically, and thus helps to counter the claim made by some Western scholars that Asian students are not critical in their thinking. The lowest score obtained on the test was 18 and the highest was 50. The mean score might have been higher had the sample been larger.

6.2 Public Speaking Test

The reliability coefficient for the public speaking test was obtained from the Moodle quiz module by clicking the results and statistics blocks, respectively. As indicated in Figure 1, the computed reliability value (i.e., the coefficient of internal consistency) was at a satisfactory level: α = .77. The figure also shows that the test was difficult for the students, since the test average (i.e., the mean score) was only 54 percent. The skewness and kurtosis values were both below 1.00, which indicates that the distribution of the data approaches the normal pattern.

Figure 1: Descriptive statistics for Public Speaking Test
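For readers who wish to verify such statistics outside Moodle, the sketch below computes Cronbach's alpha, skewness, and kurtosis from a 0/1 response matrix. It is a minimal sketch run on toy data, not Moodle's own computation.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def cronbach_alpha(X: np.ndarray) -> float:
    """Internal consistency for a students-by-items matrix of 0/1 scores."""
    k = X.shape[1]
    item_variances = X.var(axis=0, ddof=1).sum()
    total_variance = X.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Toy data standing in for 131 students answering 40 items.
rng = np.random.default_rng(0)
X = (rng.random((131, 40)) < 0.54).astype(int)

totals = X.sum(axis=1)
print(f"alpha    = {cronbach_alpha(X):.2f}")
print(f"skewness = {skew(totals):.2f}")
print(f"kurtosis = {kurtosis(totals):.2f}")
```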


6.3 Relationship between Critical Thinking and Public Speaking Performance

A Pearson product-moment correlation analysis conducted on the collected data showed that critical thinking was positively and significantly correlated with performance in the public speaking test. The computed coefficient was r = .34, indicating that the relationship between the two variables was moderate. The correlation value also suggests that critical thinking exerts an influence on students' performance in the public speaking test: at a moderate level, students who got a high mark on the public speaking test also received a high score for critical thinking. This result implies that the researchers' effort to include higher-order test items was worthwhile, since the test items not only test students' comprehension of the textbook content but also assess their ability to exercise critical thinking skills.
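A minimal sketch of this correlation analysis is given below; scipy.stats.pearsonr returns both the coefficient and its significance. The score arrays are illustrative toy data, not the study's actual scores.

```python
import numpy as np
from scipy.stats import pearsonr

# Toy stand-ins for the 58 students' scores on the two tests.
rng = np.random.default_rng(1)
cctt_scores = rng.integers(18, 51, size=58).astype(float)    # CCTT totals (out of 71)
speaking_scores = 0.5 * cctt_scores + rng.normal(20, 5, 58)  # loosely related speaking scores

r, p = pearsonr(cctt_scores, speaking_scores)
print(f"r = {r:.2f}, p = {p:.4f}")  # the study reported a moderate, significant r of .34
```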

6.4 Quality of Public Speaking Test Items

One of the objectives of this research was to observe and learn how the statistical analysis performed by Moodle enables teacher researchers to determine the efficacy of a test developed through the Moodle Quiz Module. It was hoped that the information gathered from the module would assist the researchers in improving their test items, which would later be saved in the test bank for future use.

6.4.1 Item analysis

Item analysis involves applying statistical procedures to test scores to generate descriptive and item statistics that assist test developers in re-examining the effectiveness of their test items and of the test as a whole. These statistics can be computed manually using formulae provided in most books on measurement and statistics; however, the procedures are rather tedious, which may discourage many people from applying them. Here the Moodle Quiz Module offers a practical solution: it performs item analysis procedures that generate the statistical data needed to examine the psychometric performance of each test item and the effectiveness of the test as a whole.

The two most useful item statistics in the report produced by the Moodle Quiz Module are the item facility index (FI) and the item discrimination index (DI). These indices assist test developers in making informed decisions about which items to retain, revise, or discard (Abdul Rashid, 1996; ScorePak®, 2005; Kaplan & Saccuzzo, 2009; Cohen, Swerdlik & Sturman, 2013). Both indices range from 0 to 1 (equivalently, 0 to 100 percent). The FI is used to assess the difficulty level of a test item, while the DI is used to examine the ability of an item to distinguish between high and low scorers on the test. A test item is considered easy if its FI approaches 1, and an item whose DI approaches zero is deemed to have poor discriminating power.
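The classical formulas behind these two indices are simple, as the sketch below shows for a 0/1 response matrix. This is the textbook upper/lower-group version of DI computed on toy data; Moodle's statistics report uses its own, related definitions.

```python
import numpy as np

def facility_index(item: np.ndarray) -> float:
    """Proportion of test takers who answered the item correctly (0..1)."""
    return float(item.mean())

def discrimination_index(item: np.ndarray, totals: np.ndarray, frac: float = 0.27) -> float:
    """Facility in the top-scoring group minus facility in the bottom group."""
    n = int(len(totals) * frac)
    order = np.argsort(totals)
    low, high = order[:n], order[-n:]
    return float(item[high].mean() - item[low].mean())

# Toy 0/1 response matrix: 131 students, 40 items.
rng = np.random.default_rng(2)
X = (rng.random((131, 40)) < 0.54).astype(int)
totals = X.sum(axis=1)

print(f"item 1: FI = {facility_index(X[:, 0]):.2f}, "
      f"DI = {discrimination_index(X[:, 0], totals):.2f}")
```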

There is no consensus among practitioners on definitive FI and DI values for determining the difficulty of an item and its discriminating power. In designing a good MCQ test, Kaplan and Saccuzzo (2009) recommend using test items with FI values ranging from .30 to .70, because such items maximize the information about differences among the test takers. This recommended difficulty range is close to the range observed in the normal bell curve, namely .25 to .75.

In this study, items with an FI lower than .30 were classified as difficult, while those with an FI above .75 were categorized as easy. The ideal difficulty value for the five-option multiple-choice items used in this study was .60, the value suggested by Cohen et al. (2013). For the discrimination index (DI), the values proposed by ScorePak® (2005) were used: items with a DI above .30 are considered to have good discriminating power, those with a DI between .10 and .29 fair power, and those with a DI below .10 poor power.
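These cut-offs translate directly into a small labelling helper, sketched below; the function name and the exact boundary handling (e.g., whether .30 itself counts as good) are the author's illustrative choices rather than part of the cited sources.

```python
def classify_item(fi: float, di: float) -> tuple[str, str]:
    """Label an item using the FI and DI cut-offs adopted in this study."""
    if fi < 0.30:
        difficulty = "difficult"
    elif fi > 0.75:
        difficulty = "easy"
    else:
        difficulty = "desirable"

    if di > 0.30:
        power = "good"
    elif di >= 0.10:
        power = "fair"
    else:
        power = "poor"  # includes negative DI, which calls for immediate review

    return difficulty, power

print(classify_item(0.54, 0.41))  # -> ('desirable', 'good')
```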

6.4.2 Item Facility Index (FI)

The results of the item analysis performed on the test data were obtained from the Moodle Quiz Module by clicking on the statistics block, which led the researchers to the page entitled Quiz structure analysis. The relevant data can be sorted on that page, but the present researchers preferred to download the data and convert them into an Excel file, in case there was a need to later transform them into SPSS data. The data were then sorted to rank the test items according to their FI and DI values. Analysis based on these two values revealed that, of the forty test items, twenty-five were in the desirable FI range of .30-.70, nine were in the easy range, and six were in the difficult range. Two of the items were found to be extremely easy, since more than ninety percent of the students answered them correctly; these items may need to be replaced with more challenging items or with items having better FI values. One item was found to be extremely difficult, since only three of the 131 students (an FI of about .02) managed to answer it correctly.

6.4.3 Item Discrimination Index (DI)

As for the individual items' power to distinguish good students from poor ones, seventeen items were listed as having good discriminating power (DI ranging from .31 to .54), fifteen as having fair power (DI from .12 to .29), and eight as having poor power (DI from -.19 to .09). Among the items with low DI, three had a negative value, which indicates that students who scored poorly on the test got these items right while those who did well did not (Cohen et al., 2013; ScorePak®, 2005). These items need immediate attention, and decisions must be made on whether to revise or discard them.

6.4.4 Using FI, DI, and DE values in evaluating the effectiveness of test items

Most advocates of item analysis recommend that test developers refer to both the FI and DI values of a test item to determine its quality. In this study, the values of these two indices generated by the Moodle Quiz Module were used to examine the strengths and weaknesses of each of the 40 test items. For example, based on its FI and DI values, test item 1 is an easy item with poor discriminating power. Figure 2 shows the relevant statistics gathered from the item analysis performed on test item 1: the item has a high FI value but a low DI value. These values indicate that the item was easy for the majority of the students who took it but had poor ability to discriminate good students from poor ones. It is a good idea to retain this item after improving its DI, because having a few easy items in a test motivates weaker students, in particular, to continue attempting the test.

Figure 2: Item analysis statistics for test item 1


To improve item 1, we need to analyse its distractor efficiency (DE) value and the students' responses to the item; these data can be obtained from Moodle's quiz structure analysis. Like its DI value, item 1's DE was low, at 11.36%, which implies that many of its distractors were not functioning well. According to Hingorjo and Jaleel (2012), a good test item, with at most one non-functioning distractor, usually has a DE in the range of 75-100%, while a poor item has a DE of less than 5%. Figure 3 presents the Moodle analysis of students' responses to test item 1. As the analysis indicates, four of the options were poor distractors: the first option attracted only 9 students, while the second and fourth options attracted only one student each. These options therefore need to be improved so that they attract more test takers, which in turn should raise the item's DE and strengthen its discriminating power.

Figure 3: Analysis of students' responses to test item 1
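One common way to quantify DE, following Hingorjo and Jaleel (2012), is to count the distractors chosen by at least 5% of test takers, as sketched below on illustrative option counts. Moodle reports its own DE statistic, so the percentages in the figures come from Moodle's internal computation rather than from this formula.

```python
def distractor_efficiency(option_counts: dict[str, int], key: str) -> float:
    """Percentage of distractors chosen by at least 5% of test takers."""
    n = sum(option_counts.values())
    distractors = {opt: c for opt, c in option_counts.items() if opt != key}
    functional = sum(1 for c in distractors.values() if c / n >= 0.05)
    return 100.0 * functional / len(distractors)

# Illustrative counts for a five-option item taken by 131 students,
# patterned on the response analysis for item 1 (key assumed to be "C").
counts = {"A": 9, "B": 1, "C": 119, "D": 1, "E": 1}
print(f"DE = {distractor_efficiency(counts, key='C'):.1f}%")  # only "A" clears the 5% bar -> 25.0%
```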

Another good example is test item 15, which is considered a good item to keep in the test bank for future use. Figure 4 shows the relevant statistics for the item. Its FI value is within the desirable range of .30-.70, and its DI value exceeds .31, indicating good discriminating power. However, the item has a DE of only 53.61%, which is well above 5% but still lower than the desired DE of 75%.

Figure 4: Item analysis statistics for test item 15


Detailed analysis of the responses to test item 15, presented in Figure 5, clearly shows that all of the item's options managed to attract students, which accounts for its high DI value. The item is challenging and sits at a high level of Bloom's taxonomy, because it requires students to read each option carefully and figure out the relationships among them. However, the item still has a weakness: the second option attracted only one student and should therefore be improved to attract more.

Figure 5: Analysis of students' responses to test item 15

Test developers should be wary of items with negative DI values, because such items indicate that more weak students than good students managed to select the correct option (Kaplan & Saccuzzo, 2009; ScorePak®, 2005). A negative DI can be due either to a badly constructed item or to a wrongly keyed item. If, on closer examination, the item is found to be badly constructed, it should be eliminated from the test; if the negative value is due to the test developer's carelessness in keying the correct response option, the item should be re-keyed and revised, especially if its DI would otherwise be good. Figure 6 shows the quiz structure analysis for item 23, one of the items with a negative DI. Its fairly low FI value of about .34 reflects that the item was quite a difficult one.

Figure 6: Item analysis statistics for test item 23

Nevertheless, as indicated by its negative DI, the item failed to discriminate among the students who took the test according to their ability. Upon examining the analysis of responses for the item (given in Figure 7), the researchers realized that one of the test distractors attracted only one student. In addition, the item was not clearly stated: the preposition 'of' should have been inserted after the word 'objective'. The item therefore definitely needs revision.


Figure 7: Analysis of students' responses to test item 23
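A quick screen for the mis-keying scenario described above can be automated, as in the hedged sketch below: if a supposed distractor correlates with the total score more strongly than the keyed answer does, the key deserves a second look before the item is revised or discarded. The data layout and function name are illustrative assumptions, not part of Moodle's output.

```python
import numpy as np

def miskey_check(choices: np.ndarray, totals: np.ndarray, key: str, options: list[str]) -> str:
    """Report the option whose selection correlates most strongly with total score."""
    corrs = {}
    for opt in options:
        picked = (choices == opt).astype(float)
        if picked.std() > 0:  # skip options nobody (or everybody) chose
            corrs[opt] = float(np.corrcoef(picked, totals)[0, 1])
    best = max(corrs, key=corrs.get)
    return f"keyed '{key}', but '{best}' correlates most strongly with total score"

# Toy data: strong students tend to pick "B" even though the item is keyed "D".
rng = np.random.default_rng(3)
totals = rng.integers(10, 40, size=131)
choices = np.where(totals > 25, "B", rng.choice(list("ACDE"), size=131))
print(miskey_check(choices, totals, key="D", options=list("ABCDE")))
```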

7. Conclusion and Recommendations

This paper has demonstrated that Moodle is indeed a robust LMS. It enables educators, particularly language instructors, to develop and administer a paperless test to a large number of students without having to score the test manually. It also generates statistical data that can help instructors determine the effectiveness of their test items and guide them in improving those items. It is clearly a useful tool for individuals keen to improve their test construction skills and, in turn, to enjoy their teaching profession more.

However, administering online assessment in Moodle has not been without problems. Preparing multiple-choice questions is 'costly in time', since this type of assessment depends on implementing an accurate scale for testing students' knowledge (Novo-Corti, Varela-Candamio, & Ramil-Díaz, 2013). Moreover, educators should not rely solely on the item analysis results generated by the Moodle quiz module, because the data are tentative and can be influenced by many variables, such as the type and number of test takers and the instructional procedures used (ScorePak®, 2005). Instructors may also frequently need to exercise judgment about what kinds of test items to include in their tests so that the learning outcomes are measured successfully and effectively. Thus, developing and conducting online assessments demands not only expertise in test development procedures but also appropriate post-test activities to determine how effectively the tests assess students' performance, which should be closely aligned with the course learning objectives. Developing assessment literacy should certainly be a constant and deliberate effort of all educators, to ensure that they have sound knowledge and skills in testing and validation.

References

Abdul Rashid, Rosyati. (1996). The development and validation of an institutional reading placement test. Retrospective Theses and Dissertations, Paper 207. Retrieved from http://lib.dr.iastate.edu/rtd/207

Avison, D., Lau, F., Myers, M., & Nielsen, P. A. (1999). Action research. Communications of the ACM, 42(1), 94-97. Retrieved from http://www.researchgate.net/publication/220422055_Action_Research/file/60b7d51426bc9eb0cd.pdf


Aydin, C. C., & Cagiltay, N. E. (2007). How assessment system of an open source learning management system can be integrated to a remote laboratory application? Problems and solutions. In The 18th Annual IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC'07), 1-3. Retrieved from http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=4394850

Aydin, C. C., & Tirkes, G. (2010). Open source learning management systems in e-learning and Moodle. In Education Engineering (EDUCON), 2010 IEEE, 593-600, 14-16 April 2010. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5492522&isnumber=5492336

Brown, H. D. (2004). Language assessment: Principles and classroom practices. New York: Pearson Education.

Chung, C. H., Pasquini, L. A., & Koh, C. E. (2013). Web-based learning management system design considerations for higher education. Learning and Performance Quarterly, 1(4), 24-37. Retrieved from http://www.sageperformance.com/ojs/index.php/LPQ/article/view/41/pdf_1

Cohen, R. J., Swerdlik, M. E., & Sturman, E. D. (2013). Psychological testing and assessment: An introduction to tests and measurement (8th ed.). New York: McGraw-Hill.

Costa, C., Alvelos, H., & Teixeira, L. (2012). The use of Moodle e-learning platform: A study in a Portuguese university. Procedia Technology, 5, 334-343. Retrieved from http://www.sciencedirect.com/science/article/pii/S2212017312004689

Ennis, R. H., Millman, J., & Tomko, T. N. (1985). Cornell Critical Thinking Tests Level X and Level Z manual (1st ed.). CA: Midwest Publications.

Hingorjo, M. R., & Jaleel, F. (2012). Analysis of one-best MCQs: The difficulty index, discrimination index and distractor efficiency. Journal of the Pakistan Medical Association, 62(2), 142-147.

Izard, J. (2005). Module 7: Trial testing and item analysis in test construction. Retrieved from http://www.iiep.unesco.org/fileadmin/user_upload/Cap_Dev_Training/Training_Materials/Quality/Qu_Mod7.pdf

Jou, M., & Wang, J. (2013). Investigation of effects of virtual reality environments on learning performance of technical skills. Computers in Human Behavior, 29(2), 433-438. Retrieved from http://www.sciencedirect.com/science/article/pii/S0747563212001240

JISC. (2010). Effective assessment in a digital age: A guide to technology-enhanced assessment and feedback. JISC e-Learning Programme, UK. Retrieved from http://www.jisc.ac.uk/media/documents/programmes/elearning/digiassass_eada.pdf

Kaplan, R. M., & Saccuzzo, D. P. (2009). Psychological testing: Principles, applications and issues (7th ed.). CA: Wadsworth, Cengage Learning.

Lim Hooi Lian, Wun Thiam Yew, & Chew Cheng Meng. (2014). Enhancing Malaysian teachers' assessment literacy. International Education Studies, 7(10), 74-81.


Lucas, Stephen E. (2010). The art of public speaking (11th ed.). New York: McGraw-Hill.

Malini Ganapathy & Sarjit Kaur. (2014). ESL students' perceptions of the use of higher order thinking skills in English language writing. Advances in Language and Literary Studies, 5(5), 79-87.

Malaysia Ministry of Higher Education (MOHE). (2007). National Higher Education Action Plan 2007-2010.

Ministry of Education Malaysia. (2013). Malaysia Education Blueprint 2013-2025: Preschool to post-secondary education.

Monarch Media. (2010). Open-source learning management systems: Sakai and Moodle (Business white paper). Monarch Media, Inc. Retrieved from http://www.monarchmedia.com/enewsletter_2010-3/open-source-lms-sakai-and-moodle.pdf

Novo-Corti, I., Varela-Candamio, L., & Ramil-Díaz, M. (2013). E-learning and face to face mixed methodology: Evaluating effectiveness of e-learning and perceived satisfaction for a microeconomic course using the Moodle platform. Computers in Human Behavior, 29(2), 410-415. Retrieved from http://www.sciencedirect.com/science/article/pii/S0747563212001562

Rajendran Nagappan. (2001). The teaching of higher-order thinking skills in Malaysia. Journal of Southeast Asia Education, 2(1).

ScorePak®. (2005). Understanding item analysis reports. Retrieved from http://www.washington.edu/oea/services/scanning_scoring/scoring/item_analysis.html

Shaharom Abdullah. (2004). Reading comprehension test as a measurement of critical thinking ability (Unpublished doctoral dissertation). Universiti Pendidikan Sultan Idris, Malaysia.

The Measurement and Evaluation Center, University of Texas at Austin. (2003). Test item analysis and decision making. Retrieved from http://www.tutzauer.com/TLC/Test_construction_handout.pdf

University of Washington. (2005). Item analysis. Retrieved from http://www.washington.edu/oea/pdfs/resources/item_analysis.pdf

Whelan, R., & Bhartu, D. (2007). Factors in the deployment of a learning management system at the University of the South Pacific. In ICT: Providing choices for learners and learning. Retrieved from http://www.ascilite.org.au/conferences/singapore07/procs/whelan.pdf

Wright, P., & Wright, G. (2011). Using Moodle to enhance Thai language learning: Instructor and learner perspectives. The Journal of Kanda University of International Studies, 23, 375-398. Retrieved from https://www.kandagaigo.ac.jp/kuis/about/bulletin/jp/023/pdf/019.pdf

Zou, J., Liu, Q., & Yang, Z. (2012). Development of a Moodle course for schoolchildren's table tennis learning based on Competence Motivation Theory: Its effectiveness in comparison to traditional training method. Computers & Education, 59(2), 294-303. Retrieved from http://www.sciencedirect.com/science/article/pii/S0360131512000097
