Cordova_CTT and IRT

8/9/2019 Cordova_CTT and IRT

1/22

Evaluation of FLA Test:

A Comparison Between CTT and IRT

Maria Theresa P. Cordova


2/22

Review of Literature

Classical Test Theory (CTT)

Item Response Theory (IRT)

Front Line Ambassadors (FLA)


3/22

Classical Test Theory

Collection of mathematical concepts that formalize and clarify certain questions about constructing and using tests, and then provide methods for answering them (McDonald, 1999).

Most popular choice because of the ease of use and adaptability in analyzing almost all different types of tests (Hambleton & Plake, 1995).

Each individual has a true score, which would be obtained if there were no errors in measurement.
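The true-score idea can be illustrated with a minimal sketch, assuming the usual CTT decomposition of an observed score into a true score plus random error. The scores, the error spread and the function name are hypothetical and are not taken from the FLA data.

```python
import random

def simulate_observed_scores(true_scores, error_sd, seed=0):
    """Observed score = true score + random measurement error (assumed normal)."""
    rng = random.Random(seed)
    return [t + rng.gauss(0.0, error_sd) for t in true_scores]

# Hypothetical true scores on a 50-item test.
true_scores = [30, 35, 40, 45, 50]

# With no measurement error the observed scores equal the true scores.
no_error = simulate_observed_scores(true_scores, error_sd=0.0)

# With measurement error the observed scores scatter around the true scores.
with_error = simulate_observed_scores(true_scores, error_sd=2.0)
```

Setting `error_sd` to zero reproduces the statement on the slide: without errors in measurement, each person obtains exactly the true score.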


4/22

Classical Test Theory

Framework: item difficulty, item-test correlation, reliability coefficient and standard error of measurement.

Magno (2010) discussed Hambleton's argument that one limitation of CTT is that it needs a large sample, because the statistics that describe items and questions are sample dependent.


5/22

Item Response Theory

The model proposes a mathematical relationship between a person's ability, the difficulty of the task and the probability of the person succeeding on that task (Wright & Mok, 2000).

Known to be sample free or sample independent, because it looks at the characteristics of the item, or item parameters.


6/22

Item Response Theory

Magno (2010) discussed that IRT is a probabilistic unidimensional model which asserts that

(1) the easier the question, the more likely the student will respond correctly to it.

(2) the more able the student, the more likely he will be able to answer the question as compared to a less able student.
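The two assertions above can be checked with a short sketch. It assumes the dichotomous Rasch model, which the slides do not name explicitly but which is the model family that Winsteps estimates. The function name and the logit values are illustrative.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are on the same logit scale (an assumption of this sketch).
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# (1) Same person, easier item: higher probability of success.
p_easy_item = rasch_probability(ability=0.0, difficulty=-1.0)
p_hard_item = rasch_probability(ability=0.0, difficulty=1.0)

# (2) Same item, more able person: higher probability of success.
p_more_able = rasch_probability(ability=1.0, difficulty=0.0)
p_less_able = rasch_probability(ability=-1.0, difficulty=0.0)
```

When ability equals difficulty the function returns 0.5, the point at which a person is equally likely to succeed or fail on the item.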


7/22

Front Line Ambassadors

Training Program

Mystery Shopping Program

Rewards Program

Deployment


8/22

Method

Participants

Instrument

Data Analysis


9/22

Participants

53 Front Line Ambassadors (FLA) from Western Union branches in Metro Manila.

Attended one-day Front Line Ambassador Training before taking the FLA Test.

Perform "To Receive Money" and "To Send Money" transactions over the counter, with an average of 20 transactions daily.


10/22

Instrument: FLA Test

Review of existing FLA Test and Online FLA Test


11/22


50 items; multiple choice

Components: Western Union History, Standard Procedures, Agent Security and Point of Sale (POS) Audit.

Cognitive Skills: recognition (20 items), understanding (10 items), analyzing (10 items) and applying (10 items).

Face and construct validity


12/22


Objectives:

(1) demonstrate understanding of the origin of the money transfer business

(2) demonstrate understanding of how to serve the customer by following the standard procedures

(3) demonstrate understanding of compliance by observing the security policies and guidelines

(4) demonstrate understanding of providing customer experience service.


13/22

Data Analysis: CTT

Item difficulty was determined by identifying the upper 27% and lower 27% of scorers as the high- and low-scoring groups. The total number of respondents needed is 28, distributed equally between the upper and lower categories.

The correct and incorrect responses of the two groups were tabulated in Excel format.

Correct answers were coded as 1 and incorrect answers were coded as 0.

Pearson product-moment correlation was then used to determine the relationship between the variables being studied.

Reliability and standard error of measurement (SEM) were obtained using Statistica.
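The upper/lower 27% procedure can be sketched as follows. The slides do not state the exact formulas used, so this sketch assumes a common convention: difficulty is the mean proportion correct across the two groups, and discrimination is the difference between them. Function names and the demo responses are hypothetical.

```python
def group_size(n_respondents, fraction=0.27):
    """Number of respondents in each of the upper and lower groups."""
    return round(n_respondents * fraction)

def upper_lower_item_analysis(responses, fraction=0.27):
    """Return (difficulty, discrimination) per item from 0/1 coded responses.

    responses: one list of 0/1 item scores per respondent.
    """
    n_group = group_size(len(responses), fraction)
    ranked = sorted(responses, key=sum, reverse=True)  # highest total first
    upper, lower = ranked[:n_group], ranked[-n_group:]
    results = []
    for j in range(len(responses[0])):
        p_upper = sum(person[j] for person in upper) / n_group
        p_lower = sum(person[j] for person in lower) / n_group
        difficulty = (p_upper + p_lower) / 2   # assumed convention
        discrimination = p_upper - p_lower     # assumed convention
        results.append((difficulty, discrimination))
    return results

# Hypothetical data: 4 respondents, 2 items, half the sample in each group.
demo = [[1, 1], [1, 0], [1, 0], [0, 0]]
demo_results = upper_lower_item_analysis(demo, fraction=0.5)
```

With 53 respondents, `group_size(53)` gives 14 per group, which agrees with the 28 respondents mentioned on the slide.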


14/22

Data Analysis: IRT

Item response theory analysis used the Winsteps software to obtain the person and ability scores for separation, reliability, RMSE and SE.


15/22

Results

Item Analysis

Reliability

Standard Error of Measurement


16/22

Item Analysis: Difficulty

CTT: 82% or 41 items were in the easy level and 18% or 9 items were in the average level. No item was categorized in the difficult level.

IRT: 70% or 36 items were in the easy level, 4% or 2 items in the average level and 22% were considered in the difficult level.

The test was fairly easy because the majority of the respondents answered most of the items correctly.

Matched items: 70%

Mismatched items: 30%

CTT and IRT item difficulty estimates: significant but low correlation, r = .32
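The comparison of the two sets of difficulty estimates relies on the Pearson product-moment correlation. A minimal sketch of that computation follows. The four item values are hypothetical and are not the FLA estimates.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical estimates for four items (illustration only).
ctt_difficulty = [0.90, 0.80, 0.70, 0.60]  # proportion correct
irt_difficulty = [-1.5, -0.2, -0.9, 0.4]   # logits
r = pearson_r(ctt_difficulty, irt_difficulty)
```

The sign of `r` depends on how each index is scaled: a higher proportion correct marks an easier item, while a higher logit marks a harder one.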


17/22

Item Analysis: Item Discrimination

CTT: 82% or 41 poor items and 18% or 9 very good items. This means that most low-scoring test-takers answered the poor items correctly about as often as the high-scoring test-takers.

IRT: 18% or 9 very good items (a value greater than 1 means that the item discriminates between high and low performers more than expected for an item of this difficulty).

Matched very good items: 88% or 8 items matched

Mismatched: 12% or 2 items mismatched.

CTT and IRT item discrimination indices: the correlation r = .853 means that the two methods have a very high relationship.


18/22

Reliability

Reliability                  FLA Test
CTT  Cronbach's alpha        .946
CTT  Split half              .972
IRT  Person Reliability      .50
IRT  Item Reliability        .66
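Cronbach's alpha was obtained with Statistica in the study. As a rough sketch of what that coefficient computes, the function below applies the standard alpha formula to 0/1 coded responses. The function name and the demo data are hypothetical.

```python
import statistics

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-respondent item-score lists."""
    n_items = len(responses[0])
    # Variance of each item across respondents.
    item_variances = [
        statistics.pvariance([person[j] for person in responses])
        for j in range(n_items)
    ]
    # Variance of the total scores.
    total_variance = statistics.pvariance([sum(person) for person in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical data: 4 respondents, 3 items.
demo = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
alpha = cronbach_alpha(demo)  # 0.75 for this demo data
```

Alpha approaches 1 when respondents who answer one item correctly tend to answer the others correctly as well.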


19/22

Summary of Measure: Persons and Items

As presented in Table 3, the person separation value is 1.00, which means that only one ability group can be generated from the sample.

For the item separation, the value of 1.38 means that the items in the test can still be classified into subgroups.

Review the table of specifications to identify possible classification of the test items either by content or competency.

RMSE values for person (.80) and item (.73) indicated that the data do not fit the expected ability and test difficulty.

For the examination of fit, the average INFIT and OUTFIT statistics for persons and items indicated goodness of fit because the values were less than 1.5.
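The separation values tie back to the reported IRT reliabilities. The sketch below assumes the standard Rasch relationship between separation G and reliability, R = G^2 / (1 + G^2), and the strata formula (4G + 1) / 3. The function names are illustrative.

```python
def reliability_from_separation(separation):
    """Rasch reliability implied by a separation index G: G^2 / (1 + G^2)."""
    return separation ** 2 / (1 + separation ** 2)

def strata(separation):
    """Number of statistically distinct levels implied by G: (4G + 1) / 3."""
    return (4 * separation + 1) / 3

person_reliability = reliability_from_separation(1.00)  # 0.50
item_reliability = reliability_from_separation(1.38)    # about 0.66
```

Under these assumptions the separation values of 1.00 and 1.38 reproduce the reported person reliability of .50 and item reliability of .66.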


20/22

Standard Error of Measurement

CTT: .03

IRT: .22
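For reference, the CTT standard error of measurement is conventionally derived from the score standard deviation and the reliability coefficient. The sketch below assumes that textbook formula. The slides do not report the score standard deviation, so the input values are hypothetical and do not reproduce the figures above.

```python
import math

def standard_error_of_measurement(score_sd, reliability):
    """CTT standard error of measurement: SD * sqrt(1 - reliability)."""
    return score_sd * math.sqrt(1.0 - reliability)

# Hypothetical values, chosen only to make the arithmetic easy to follow.
sem = standard_error_of_measurement(score_sd=10.0, reliability=0.75)  # 5.0
```

The formula shows why a highly reliable test has a small SEM: as reliability approaches 1, the error term shrinks toward zero.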


21/22

Discussion

Overall results for the FLA Test indicated that the data do not fit the expected ability and test difficulty.

CTT and IRT: easy level

Item difficulty estimates: low relationship.

Indices of discrimination: very high relationship.

Reliability scores revealed high internal consistency, but the test still needs revision to obtain better results for item analysis.

RMSE showed that the FLA Test can be categorized into subgroups. However, it is recommended that unidimensionality be validated by conducting principal component analysis.

Sample must include other branch locations across the Philippines.


22/22

Thank you.
