0.1 Rasch Workshop Booklet Structure of Measurement


  • 8/11/2019 0.1 Rasch Workshop Booklet Structure of Measurement


    Workshop on

    Rasch Analysis:

    A Practical Guide to WinSteps

    Dr. Azrilah Abdul Aziz

    Mohd Saidfudin Masodi

    February 2010

    Table of Contents

    1.0 Rasch Course Overview

    2.0 Program Outline: Rasch Analysis 2-Days Workshop Outline

    3.0 Keynote:

    3.1 Modern Measurement Paradigm: Easier to read and better analysis using Rasch-based Approach

    3.2 Developing an Instrument Construct Made Simple

    4. Workshop:

    WinSteps: A Practical Guide

    - Setting up Control File

    - Basic Rasch Analysis

      - Person-Item Distribution Map Construct and interpretation
      - Partial Credit Rating Analysis Item Validity
      - Person Response Validity Analysis

    - Computation of Probabilistic Likelihood Estimate

    5. Appendices

    5.1 Sample Size and Item Calibration [or Person Measure] Stability

    5.2 What do Infit and Outfit, Mean-square and Standardized mean?

    5.3 Rating Scale Instrument Quality Criteria

    5.4 Example of Person Item Map re-distribution

    Rasch Course Overview

    Key Note Speaker:

    Modern measurement paradigm: The Rasch Approach

    Mohd. Saidfudin Masodi, IRCA ?on@ ISO 9001 QMS Lead Assessor

    Program Outline

    Attendees will learn:

    12:30-2:15. Lunch

    Afternoon. 2:15. Guidance through a Rasch Analysis starting with an Excel file and using Winsteps software, more interpretation of output.

    Practice using Winsteps/Ministep software in analyzing Readiness to Change data that is included with the software.

    3:45. Break

    4:00-5:30. Participants will be given an Excel data file and will write their own control files and run them on the sample data with hands-on help from facilitators.

    Day 2. Hands-on Analysis Practice & Discussion

    Morning. 8:30am. Preliminary briefing.

    participants should have a working familiarity with Rasch Measurement and Analysis concepts, running the Winsteps program, and interpreting Winsteps output.

    5:30pm. End of Course.

    3.1 Rasch Model: The Modern Measurement Paradigm

    Dr. Azrilah Abdul Aziz

    Mohd Saidfudin Masodi

    For many decades, we have assured ourselves that it is almost impossible to obtain quantitative data in the social sciences, and we have deluded ourselves into conducting only descriptive research. Over the years, social scientists became aware that there should be more than descriptive findings, more than the typical reporting of medians and percentages. There should be a way of presenting the findings more meaningfully. It would be more interesting to see clearly the inherent relationship between the human and the observable actions being assessed. Rather than reporting only association and correlation values, the analysis should provide a clearer picture of what is happening between the human and the observable actions: a sort of hierarchical relationship between the two.

    The Rasch measurement model has made it possible for social scientists to conduct calibrated measurement in which the human is the focus of attention. In our day-to-day lives, we rely on standard measurement systems to measure and cut timber, buy lengths of cloth, assemble the correct amounts of ingredients to bake a cake, and administer appropriate doses of medicine to ailing relatives. Similarly, we can conduct educational research, treating and analysing data from surveys or psychological investigations in the same manner as we do with standard measurement systems.

    The Rasch measurement model is a way to make sense of the world. Experience is continuous, but once we notice the experience, it becomes discrete. We sense happiness when receiving flowers during convocation. When we distinguish between the different kinds of happiness (not very happy, happy, happier, and happiest), that moment of observation becomes discrete. We then choose a dimension according to its utility for the happy sensations: for example a flower, a bouquet of flowers, a bouquet of flowers with chocolate, and a bouquet of flowers with chocolate and a little teddy bear. To make our observations more meaningful, we represent them with scores in the form of:

    Not very happy, happy, happier, happiest

    which we score as observations: x = 1, 2, 3, 4

    It is common to take the raw score of each sensation as a sort of measurement for the dimension. However, raw scores are only indications of a possible measure. Raw scores cannot be the measures sought because, in their raw state, they have little inferential value. To develop metric meaning, the counts must be incorporated into a stochastic process which constructs inferential stability.

    A survey item, for example, is answered by 0, 1, 2, or 3, with each value corresponding to a phrase ranging from "strongly disagree" to "strongly agree". Responses from the survey are ordinal variables, and with them we can determine the median and the percentile rank, and determine the relationship between two characteristics by means of Spearman's rank order correlation. That is as far as we can go with ordinal data. It is by no means a measurement.
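    The two ordinal summaries named above can be sketched as follows. This is only an illustration: the response lists `item_a` and `item_b` are made-up data, and the helper names are assumptions, not part of the workshop material. Spearman's coefficient is computed here as the Pearson correlation of average ranks, which handles tied ratings.

```python
from statistics import median

def average_ranks(values):
    """Rank the values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of the 1-based positions
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical ratings (0-3) from eight respondents on two survey items.
item_a = [0, 1, 1, 2, 3, 3, 2, 0]
item_b = [1, 1, 2, 2, 3, 2, 3, 0]

print("median of item A:", median(item_a))
print("Spearman rho:", round(spearman(item_a, item_b), 3))
```

    Note that neither summary uses the distance between categories, only their order, which is the limitation the text describes.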

    The separation of the rating values assigned 0, 1, 2, or 3 is not of equal interval and therefore does not give a scale, as has been construed all the while. Calling such a rating a "scale" has submerged the truth of measurement: it does not possess the characteristics of what is deemed to be a scale by the standard definition in physical science. Ignorance of standard protocol has led us into a situation which warrants a review of measurement as it is normally perceived in human science.

    Similarly, the practice of using raw counts may give the impression that they are interval measures of experience. But this is always an illusion. From our observations on the sensation of happiness, the assignment of the number labels 1, 2, 3, 4 to the options "not very happy", "happy", "happier", "happiest" does not make these numerals equally distanced measures. If the category labels are not equally distanced, then we cannot legitimately apply processing that assumes interval data, including the mean and standard deviation, to these non-interval category labels.

    Apart from that, missing data is a common issue in survey data. Missing data may result from oversight, non-compliance or incidental interference. Since the purpose of conducting research is to use existing information to make inferences about what is still unknown, missing data are of importance too. A useful measurement model for constructing inference from observation must be unaffected by these missing data. The measurement model must also enable estimation of the precision of the inference, and be able to detect and evaluate discrepancies between observation and expectation. Hence raw counts cannot be relied upon to serve as measures.

    Accuracy of a measure may be achieved through replication. When similar results occur repeatedly, we are confident that similar results will occur again in future. However, replication does not guarantee accuracy of measure unless the testing instrument is operated for a specific purpose or target and, more importantly, each measurement is independent of the others. Irrespective of the person, a thermometer reading is always accepted independently. On the same premise, a measurement instrument thus developed must function equally for everyone. A survey conducted, or an examination carried out, on a subject matter rightfully should not be affected by the respondent who is taking it.

    Thus, in order to construct inferences from observation, the measurement model must:

    (a) be able to produce linear measures;

    (b) overcome missing data;

    (c) give estimates of precision;

    (d) be able to detect misfits or outliers; and

    (e) keep the parameters of the object being measured and of the measurement instrument separable, or independent of each other.

    Only Rasch measurement models solve these problems. The Rasch measurement model, referred to in short as the Rasch model, helps in constructing a scale based on a set of survey items.

    Let's start discussing how the Rasch measurement model works using simple outcomes. Back to our example of the happy sensation of receiving a bouquet of flowers on convocation day: assume a guy goes around giving a bouquet of flowers to a successful lady graduate. On the first attempt, the lady graduate has a 50:50 chance of liking the gift. If the guy were to give bouquets of flowers to 10 other lady graduates, he may receive a response of six (6) likings against four (4) dislikes. This can be expressed as probabilistic odds of 60:40. We may get other outcomes, perhaps in the order of 70:30 or, at the opposite end of the scale, 20:80.

    This order of probabilistic turn of events can be represented in a line diagram as in Figure 1.

    FIGURE 1. Probabilistic line diagram

    Using a scientific calculator, the probabilistic odds can be converted into a series of numerical values. However, we notice that these numerical values are heavily clustered towards the leftmost end of the ruler when we attempt to position them on an equal interval scale. Equal distance between values is a prerequisite, being one of the criteria of a measurement scale. Clearly, converting the odds ratio into a numerical value does not yield a scale of equal interval. Therefore, it cannot behave as a ruler of measurement.

    This is depicted in Figure 2.

    FIGURE 2. Numerical scale

    In order to achieve an equal interval scale, we can introduce the logarithm of the odds. Maintaining the same odds ruler as in Figure 1, running from 0.01 to 100, we can create an equal interval separation between the log odds units on the line, hence the measurement ruler with the logit unit. This can be verified by computing the base-10 logarithms: log 0.01 equals -2.0; log 0.1 equals -1; log 1 equals 0. Figure 3 shows the newly established logit ruler as a scale with equal interval separation. It is just like looking at a thermometer with '0' as water being ice and 100 as boiling point, whilst the negative extreme end is -273 °C, the point where all atoms of any element come to a standstill.
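    The conversion from odds to log odds can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative assumptions, and base-10 logarithms are used only because the worked values in the text (log 0.01 = -2.0, log 0.1 = -1, log 1 = 0) are base 10.

```python
import math

def odds(likes, dislikes):
    """Odds of liking versus disliking, e.g. 60:40 gives 1.5."""
    return likes / dislikes

def log_odds(likes, dislikes):
    """Base-10 logarithm of the odds (0 for an even 50:50 chance)."""
    return math.log10(odds(likes, dislikes))

# The like:dislike outcomes mentioned in the text.
outcomes = [(20, 80), (50, 50), (60, 40), (70, 30)]
for likes, dislikes in outcomes:
    print(f"{likes}:{dislikes}  odds={odds(likes, dislikes):.3f}  "
          f"log odds={log_odds(likes, dislikes):+.3f}")

# Odds ruler from 0.01 to 100: unequal gaps in odds, equal gaps in log odds.
ruler = [0.01, 0.1, 1, 10, 100]
ruler_logs = [math.log10(v) for v in ruler]
steps = [round(b - a, 9) for a, b in zip(ruler_logs, ruler_logs[1:])]
print("log-odds steps along the ruler:", steps)
```

    Each step along the log-odds ruler has the same size, which is the equal-interval property the raw odds lack.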

    FIGURE 3. Log odds unit ruler

    FIGURE 4. Observation of responses from survey

    While this raw data matrix is all the observation we have, it is of limited utility. Even though it contains everything we could observe, it does not help to predict what will happen in future. The raw score does not allow us to make any useful inference or draw any conclusion about the items or about the interactions between the items and the persons. It only gives an order of preference. The raw score is therefore ordinal. It lacks the depth to give any meaningful interpretation of the data obtained.

    For example, if the survey is to differentiate the ability of persons, we cannot conclude which respondents are more able simply by looking at the raw score. If we were to accept the respondent with the highest raw score as more able than those with lower raw scores, it would still be complicated to distinguish respondents with the same raw score; we do not have any basis to differentiate them. We must build a useful expectation of whether the respondent will agree or succeed on the next item. To know about the ability of the respondent in agreeing, we need to know the degree of difficulty of endorsing the item which the respondent has to attempt.

    The data matrix can be arranged so that the items are ordered from least to most difficult and the persons are ordered from least to most able, as in Figure 5. The higher up the table one goes, the more able the persons. The further right across the table one goes, the more difficult the items. This type of arrangement is termed a Scalogram or Guttman matrix.
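    The rearrangement described above can be sketched in a few lines. The response matrix `raw` and the person labels are hypothetical, and raw totals are used only to order rows and columns (they stand in for ability and difficulty here, as an assumption for illustration).

```python
def scalogram(responses):
    """Reorder a 0/1 response matrix into Guttman (scalogram) layout.

    Rows: persons from most able (highest total) at the top.
    Columns: items from easiest (most often endorsed) on the left.
    """
    persons = list(responses)
    n_items = len(next(iter(responses.values())))
    item_totals = [sum(responses[p][j] for p in persons) for j in range(n_items)]
    # sorted() is stable, so tied items and persons keep their original order.
    item_order = sorted(range(n_items), key=lambda j: -item_totals[j])
    person_order = sorted(persons, key=lambda p: -sum(responses[p]))
    rows = [(p, [responses[p][j] for j in item_order]) for p in person_order]
    return item_order, rows

# Hypothetical raw data: four persons answering four dichotomous items.
raw = {
    "P1": [0, 1, 0, 1],
    "P2": [1, 1, 1, 1],
    "P3": [0, 1, 0, 0],
    "P4": [1, 1, 0, 1],
}

item_order, rows = scalogram(raw)
print("item columns (original index):", item_order)
for person, pattern in rows:
    print(person, "".join(str(x) for x in pattern))
```

    With this small data set the sorted matrix is a perfect Guttman pattern: every row is a run of 1s followed by 0s. Real survey data rarely sort this cleanly, and the departures are what the later misfit discussion is about.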

    FIGURE 5. Scalogram of responses from survey

    From Figure 5, we can deduce that persons who are more able, i.e., those located towards the top of the table, would have a greater likelihood of responding to all the items as '1'. On the other hand, the less able persons would have a greater likelihood of responding '1' only to easy items and may find it difficult to respond to difficult items. This will generate a pattern of observable responses. In general, we can estimate that the scalogram will have mostly responses of "1" on the upper portion of the diagonal line, and responses of "0" on the lower portion of the diagonal line. This is the general rule that creates a pattern of responses which is easy to read and clearer to understand, ready to be scrutinised and explored in depth.

    This kind of response pattern is best described mathematically by a simple logistic regression, by virtue of its point of origin '0' and its maximum limit of '1'. This probabilistic curve is the foundation of Rasch analysis, where data are expected to fit the model. On the contrary, traditional research is deterministic in nature: the best fit line is established from a set of historical data, trying to describe a past event. Clearly, Rasch has a distinct feature in that it can predict an expected event. This is exhibited in Figure 6.

    From this estimation too, we can make an inference that person A made a careless mistake by not responding with a "1" as expected for the easy item in Figure 6. Likewise, person D made a lucky guess by responding with a "1" on the difficult item when we expect him to respond with a "0" instead.

    difficult items (items which are towards the rightmost end). Therefore, by calculating the maximum likelihood estimate of the event, we can predict that person D would answer a "0" at the missing box (third item from the right for person D). Similarly, person F is expected to respond with a "1" for the second item since person F, though less able, would find the item easy.

    FIGURE 7. Prediction on response pattern

    This pattern of responses gives the response validity of a person to an item. Subsequently, Rasch has enabled us to establish the construct validity of an item. An item is said to be valid when it is able to discriminate between a more able person and a poor person. This gives rise to two major discussions. The first possibility is that the person is an outlier (an "outfit") for not meeting the expected outcome. Rasch focuses on this kind of outfit response and attempts to find a reasoned argument for why the person does not fit the model.

    This contributes to the significant findings of a particular research. In traditional statistical practice, this kind of outfit data would normally have been cleaned out; in Rasch, on the contrary, it is of utmost value.

    The second, and more critical, issue is where the item construct is at stake. If the responses cannot discriminate between the able respondents and the poor ones, then there is a need to reconstruct the question or possibly discard the item. At the extreme, we may need to reconstruct the whole survey questionnaire or examination paper, as it is not measuring what we are supposed to measure. This is reflected in Rasch analysis as item reliability, hence instrument construct validity. The fundamental difference from quantitative research as practised in physical science is now addressed in natural science, in what is so-called qualitative research.

    Consequently, the rating assumed at the beginning of the survey construct, or the test result grades, can be verified as to whether it shows the expected pattern of response. This form of calibration is unique to Rasch, which can thereby comply with the physical science requirement to calibrate the scale of measurement. An instrument which is not calibrated is deemed to yield invalid data, thus rendering the whole research futile. The response pattern prediction in overcoming missing data is an essential feature in making a more accurate statistical analysis. This is one of the Rasch principles that allows a more accurate analysis to be carried out. While other models treat missing data as zero, Rasch's predictive power will produce the maximum likelihood of an event for that particular matrix. By so doing, it makes the whole data set 'as though' it is a complete data set, while other statistical methods would be short of data, making them less accurate in computing the basic chi-square statistic, let alone in calculating the mean and standard deviation, including the z-test thereof.

    These data, if put on a scatter plot, would yield a sigmoidal curve (S-curve), and do not have the equal interval that is a prerequisite in measurement, as shown in Figure 8.

    FIGURE 8. The sigmoidal curve of responses

    Without the equal interval, prediction of an event is almost impossible. To overcome the issue, a linear regression approach is applied to establish a straight line which fits the points as well as possible.

    FIGURE 9. Linear regression on the responses

    It is then used to make the required predictions by interpolation or extrapolation as necessary, as shown in Figure 10. In obtaining the best fit line, however, there exist differences between the actual point y_i and the predicted point ŷ_i that lies on the best fit line. The difference is referred to here as the error e.

    FIGURE 10. Best fit line

    By accepting the fact that there are always errors involved in the prediction model, the deterministic model renders itself less reliable. This can be resolved by transforming it into a probabilistic model by including the prediction error in the equation:

    y = mx + c + e    (2)
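    The role of the error term can be illustrated with an ordinary least-squares fit. This is a sketch under stated assumptions: the data points are invented, and `fit_line` is a hypothetical helper using the standard closed-form slope and intercept, not a routine from the workshop software.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope m and intercept c for y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c

# Hypothetical observations scattered around a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

m, c = fit_line(xs, ys)
predicted = [m * x + c for x in xs]                       # points on the best fit line
errors = [y - y_hat for y, y_hat in zip(ys, predicted)]   # e = actual - predicted

for x, y, y_hat, e in zip(xs, ys, predicted, errors):
    print(f"x={x}  y={y:.2f}  predicted={y_hat:.2f}  e={e:+.2f}")
```

    Adding each error back to its predicted value recovers the observed point exactly, which is all that the extra term e in the equation expresses.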

    The formulation of the probabilistic model of Rasch is based on this principle:

    a person having a greater ability than another person should have the

    greater probability of solving any item of the type in question and

    similarly one item being more difficult than another means that for any

    person the probability of solving the second item is the greater one.

    (Rasch, 1960)

    Therefore, in summary, the probability of success depends on the difference between the ability of the person and the difficulty of the item. The Rasch Model incorporates an algorithm that expresses the probabilistic expectations of item 'i' and person 'n' performances, mathematically expressed as:

    P_{ni}(x_{ni} = 1 \mid \beta_n, \delta_i) = \frac{e^{\beta_n - \delta_i}}{1 + e^{\beta_n - \delta_i}}    (3)

    where P_{ni}(x_{ni} = 1 \mid \beta_n, \delta_i) is the probability of person n on item i scoring a correct response (x = 1), given the person ability, \beta_n, and item difficulty, \delta_i.
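    The model can be sketched directly in code. The function names are illustrative assumptions; ability and difficulty are both taken to be in logits, and the rule "expect a 1 when the probability is at least 0.5" is a simplification used here to show how a missing response could be anticipated.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct (x = 1) response under the dichotomous Rasch model.

    Algebraically equal to exp(a - d) / (1 + exp(a - d)).
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def expected_response(ability, difficulty):
    """Most likely response: 1 if success is at least as likely as failure."""
    return 1 if rasch_probability(ability, difficulty) >= 0.5 else 0

# Only the difference between ability and difficulty matters.
for ability, difficulty in [(0.0, 0.0), (1.0, -0.5), (-1.0, 0.5)]:
    p = rasch_probability(ability, difficulty)
    print(f"ability={ability:+.1f}  difficulty={difficulty:+.1f}  "
          f"P(x=1)={p:.3f}  expected={expected_response(ability, difficulty)}")
```

    When ability equals difficulty the probability is exactly 0.5; an able person facing an easy item is expected to score 1, and a less able person facing a hard item is expected to score 0.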

    Summary

    The Rasch measurement model helps us to understand a little of how we came to fall so short of our reasonable expectations for scientific measurement in the human sciences. The Rasch measurement model provides the closest general approximation of measurement principles for the human sciences. It accomplishes the five (5) principles of a measurement model: it provides a linear equal-interval scale, overcomes missing data, estimates precision, detects misfits or provides reliability, and is replicable. Thus, by complying with all the principles, more meaningful and accurate inferences can be made from the data.

    This is the core issue in measurement: meaningfulness. There are three major aspects of meaningfulness to take into account in measurement. These have to do with the constancy of the unit, interpreting the size of differences in measures, and evaluating the coherence of the units and differences.

    First, raw scores (counts of right answers or other events, sums of ratings, or rankings) do not stand for anything that adds up the way they do. Any given raw score unit can be 4-5 times larger than another, depending on where they fall in the range, and hence the units lack constant separation. Meaningful measurement demands a constant unit. Instrument scaling by Rasch methods provides it.

    Second, meaningful measurement requires that we be able to say just what any quantitative amount of difference is supposed to represent. What does a difference between two measures stand for in the way of what is and isn't done at those two levels? Is the difference within the range of error, and so random? Is the difference many times more than the error, and so repeatedly reproducible and constant? Meaningful measurement demands that we be able to make reliable distinctions, and only the Rasch Model fulfils this requirement.

    And finally, meaningful measurement demands that the items work together to measure the same thing. If reliable distinctions can be made between measures, what is the one thing that all of the items tap into? If the data exhibit a consistency that is shared across items and across persons, what is the nature of that consistency? Meaningful measurement posits a model of what data must look like to be interpretable and coherent, and then it evaluates data in light of that model. Rasch has all these specific properties as a unique model of measurement.

    4. WinSteps:

    A Practical Guide to Rasch Analysis

    [ NOTES ]

    4.1 SETTING UP DATA .prn FILE

    [ NOTES ]

    4.2 SETTING UP DATA CONTROL FILES

    [ NOTES ]

    4.3 OUTPUT TABLES

    [ NOTES ]

    4.4 PERSON ITEM DISTRIBUTION MAP

    [ NOTES ]

    [ NOTES ]

    4.5 SUMMARY STATISTICS

    [ NOTES ]

    4.6 PARTIAL CREDIT =POL

    [ NOTES ]

    4.7 ITEM POLARITY

    [ NOTES ]

    4.8 ICC GRAPH

    [ NOTE ]

    4.9 CALCULATION OF PROBABILISTIC MAXIMUM LIKELIHOOD ESTIMATE

    Person  Person Logit  Item          Item Logit  P(F2)  P(F6)  P(F4)  P(F3)  P(F5)  P(F6)  P(F1)  Avg
            Measure                     measure                                                      item

    F2       2.01         Specialised   1.63        0.59   0.48   0.44   0.38   0.23   0.23   0.14   0.36
    F6       1.54         ,"            1.29        0.67   0.56   0.53   0.47   0.29   0.29   0.19   0.43
    F4       1.4          IT            1.29        0.67   0.56   0.53   0.47   0.29   0.29   0.19   0.43
    F3       1.15         Research      0.96        0.74   0.64   0.61   0.55   0.36   0.36   0.24   0.50
    F5       0.4          Excellence    0.64        0.80   0.71   0.68   0.62   0.44   0.44   0.31   0.57
    F6       0.4          Filter        0.29        0.85   0.78   0.75   0.70   0.53   0.53   0.38   0.65
    F1      -0.18         Consultative  0.29        0.85   0.78   0.75   0.70   0.53   0.53   0.38   0.65
                          Teamwrk       0.29        0.85   0.78   0.75   0.70   0.53   0.53   0.38   0.65
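    The entries in this table can be checked by applying the dichotomous Rasch probability to each person measure and item measure. The sketch below is an illustration, not the workshop's own calculation: it assumes that each cell is exp(person - item) / (1 + exp(person - item)) rounded to two decimals, and it uses only the persons and items whose measures are fully legible in the table.

```python
import math

def rasch_probability(person_measure, item_measure):
    """Probability of success for a person on an item, both measured in logits."""
    return 1.0 / (1.0 + math.exp(-(person_measure - item_measure)))

# Measures read from the table (logits).
persons = {"F2": 2.01, "F3": 1.15, "F1": -0.18}
items = {"Specialised": 1.63, "Research": 0.96, "Filter": 0.29}

table = {
    item: {person: round(rasch_probability(b, d), 2) for person, b in persons.items()}
    for item, d in items.items()
}

for item, row in table.items():
    cells = "  ".join(f"P({person})={p:.2f}" for person, p in row.items())
    print(f"{item:<12} {cells}")
```

    The computed values agree with the corresponding cells of the table, for example 0.59 for person F2 on the item Specialised.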

    [ NOTE ]

    4.10 PERSON MEASURE

    [ NOTE ]

    4.11 ITEM MEASURE

    INPUT: 25 Persons 20 Items MEASURED: 25 Persons 20 Items 2 CATS
    -----------------------------------------------------------------------
    GUTTMAN SCALOGRAM OF RESPONSES:
    Person |Item
           | 1 11 11 11 1211
           |20!"##1$!55"%%102$
           |--------------------
        14 +11011111111111111101  '1%111%(n$
        16 +10111111111111110001  '1#221%)s2
        18 +11111111111111110000  '1"221%))1
         4 +11111111111100101010  '0%222%sr$
         9 +11111111111010101010  )021$%)(2
        15 +11111111111011101000  '15222%)(1
        25 +11111111111101010100  )25111%)s2
         5 +11111011111111001000  )05121%)s1
         3 +11110111111100001010  '0$212%)s2
         6 +11111011110011100100  )0#121%)s1
         8 +11111110011110011000  )0"1125(*2
        13 +11110101101101010101  '1$121%t+%
        17 +11101111011011110000  '1!222%(*$
         7 +11111011010100111000  )0!121%)s1
        19 +10001101100111100111  '1221%,r$
        23 +11111001111001000011  '2$11$5(*$
        24 +11111001010101100010  '2%121%sr%
         2 +11101101011000010010  )0211$5)s2
        11 +11111110000110000100  )11212%)s1
        12 +11111110100000001100  )12212%(n2
        22 +11110011001100001100  )2211$5)(1
         1 +11111000100110000100  )01212%(*1
        21 +11101100111000000010  )21212%(1
        10 +11111110000000000100  )102210)s$
        20 +11110110100000010000  '20211%(n$
           |--------------------
           | 1 11 11 11 1211
           |20!"##1$!55"%%102$


    INPUT: 25 Persons 20 Items MEASURED: 25 Persons 20 Items 2 CATS $.#".2
    --------------------------------------------------------------------------------
    Person: REAL SEP.: .5 REL.: .%! ... Item: REAL SEP.: 1.2 REL.: .!

    Person STATISTICS: MISFIT ORDER
    ------------------------------------------------------------------------------------------------
    |ENTRY TOTAL MODEL| INFIT | OUTFIT |PT-MEASURE |EXACT MATCH| |
    |NUMBER SCORE COUNT MEASURE S.E. |MNSQ ZSTD|MNSQ ZSTD|CORR. EXP.| OBS% EXP%| Person |
    |------------------------------------&----------&----------&-----------&-----------&-----------|
    | 14 18 20 2.58 .78|1.16 .5|5.94 2.6|A-.03 .2#| ".5 ".5| '1%111%(n$|
    | 19 12 20 .42 .52|1.95 3.6|2.84 3.9|B-.19 .%#| $#." !0.| '1221%,r$|
    | 16 16 20 1.64 .61|1.11 .4|2.72 2.0|C .19 .$5| !$.! "0.1| '1#221%)s2|
    | 13 13 20 .70 .53|1.31 1.4|1.39 1.1|D .25 .%%| #$.2 !1.1| '1$121%t+%|
    | 2$ 12 20 .%2 .52|1.1$ .!|1.1$ .5|E .$" .%#| #".% !0.| '2$11$5(*$|
    | 22 10 20 -.11 .52|1.0 .5|1.02 .2|F .%5 .50| 5!. !0.0| )2211$5)(1|
    | 2% 11 20 .15 .52|1.0! .%| ." .0|G .%5 .%"| #$.2 !0.$| '2%121%sr%|
    | 2 10 20 -.11 .52|1.0! .%|1.0! .$| .%5 .50| #".% !0.0| )0211$5)s2|
    | 17 13 20 .70 .53|1.03 .2|1.07 .3|I .41 .%%| !$.! !1.1| '1!222%(*$|
    | ! 12 20 .%2 .52|1.00 .1| ."! -.$|6 .%" .%#| #".% !0.| )0!121%)s1|
    | 1 20 -.$" .5$| ." .0| .1 -.2|7 .5$ .51| #$.2 !0.5| )01212%(*1|
    | 21 20 -.$" .5$| .# -.1| .! .0|L .5$ .51| #$.2 !0.5| )21212%(1|
    | 3 13 20 .70 .53| .91 -.3| .96 .0|M .48 .%%| !$.! !1.1| '0$212%)s2|
    | 6 13 20 .70 .53| .95 -.2| .82 -.4|l .49 .%%| !$.! !1.1| )0#121%)s1|
    | 8 13 20 .70 .53| .92 -.3| .81 -.4|k .50 .%%| !$.! !1.1| )0"1125(*2|
    | 11 10 20 -.11 .52| .0 -.%| ."1 -.#|, .5# .50| #".% !0.0| )11212%)s1|
    | 12 10 20 -.11 .52| ." -.5| ."$ -.5|' .5! .50| !". !0.0| )12212%(n2|
    | 20 " 20 -.## .5%| ."" -.%| ."% -.%|* .5 .52| !$.! !2.!| '20211%(n$|
    | 9 15 20 1.30 .57| .87 -.4| .67 -.5|g .49 .$| !$.! !#.0| )021$%)(2|
    | 4 15 20 1.30 .57| .79 -.7| .58 -.7|f .53 .$| !$.! !#.0| '0%222%sr$|
    | 5 1% 20 . .55| .!! -1.0| .#" -.!|e .55 .%2| "%.2 !2.5| )05121%)s1|
    | 15 15 20 1.30 .57| .75 -1.0| .58 -.7|d .55 .$| "%.2 !#.0| '15222%)(1|
    | 10 " 20 -.## .5%| .!% -1.1| .!$ -.!|8 .## .52| "%.2 !2.!| )102210)s$|
    | 25 15 20 1.30 .57| .70 -1.2| .52 -.9|b .58 .$| "%.2 !#.0| )25111%)s2|
    | 1" 1# 20 1.#% .#1| .## -1.1| .%5 -.|9 .5! .$5| "%.2 "0.1| '1"221%))1|
    |------------------------------------&----------&----------&-----------&-----------&-----------|
    | MEAN 12.% 20.0 .5" .55| ." .0|1.21 .1| | !2.0 !$.2| |
    | S.D. 2.! .0 .! .05| .25 1.0|1.12 1.1| | 10." %.5| |
    ------------------------------------------------------------------------------------------------

    5. Appendices

    5.1 Sample Size and Item Calibration [or Person Measure] Stability

    5.2 What do Infit and Outfit, Mean-square and Standardized mean?

    a. Rating Scale Instrument Quality Criteria

    b. Example of Person Item Map Re-construction

    5.1 Sample Size and Item Calibration [or Person Measure] Stability

    How big a sample is necessary to obtain usefully stable item calibrations? Or how long a test is necessary to obtain usefully stable person measure estimates?

    The Rasch model is blind to what is a person and what is an item, so the numbers are the same.

    Each time we calibrate a set of items on different samples of similar examinees, we expect slightly different results. In principle, as the size of the samples increases, the differences become smaller. If each sample were only 2 or 3 examinees, results could be very unstable. If each sample were 2,000 or 3,000 examinees, results might be essentially identical, provided no other sources of error are in action. But large samples are expensive and time-consuming. What is the minimum sample to give useful item calibrations, that is, calibrations that we can expect to be similar enough to maintain a useful level of measurement stability?

    Polytomies

    The extra concern with polytomies is that you need at least 10 observations per category. See, for instance, Linacre J.M. (2002) Understanding Rasch measurement: Optimizing rating scale category effectiveness. Journal of Applied Measurement 3:1, 85-106, or Linacre J.M. (1999) Investigating rating scale category utility. Journal of Outcome Measurement 3:2, 103-122.

    Otherwise the actual sample sizes could be smaller than with dichotomies because there is more information in each polytomous observation.
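The 10-observations-per-category guideline can be checked before any Rasch analysis is run. The following is a minimal sketch only: the function name `sparse_categories` and the example ratings are hypothetical, and whether the counts should be taken per item or pooled across items that share a rating scale depends on the model chosen, which is assumed to be decided elsewhere.

```python
from collections import Counter

def sparse_categories(ratings, categories, minimum=10):
    """Return {category: count} for categories observed fewer than `minimum` times.

    `categories` lists every category of the rating scale, so that a
    category that was never used is reported with a count of 0.
    """
    counts = Counter(ratings)
    return {c: counts.get(c, 0) for c in categories if counts.get(c, 0) < minimum}

# Hypothetical ratings on one 4-category scale (categories 1 to 4).
ratings = [1] * 4 + [2] * 15 + [3] * 22 + [4] * 9
flagged = sparse_categories(ratings, categories=[1, 2, 3, 4])
print(flagged)  # {1: 4, 4: 9}
```

Categories that are flagged are candidates for collecting more data or for collapsing with a neighbouring category, a decision that remains a matter of judgement.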


    Person Measure Estimate Stability

    The requirements are symmetric for the Rasch model, so you need as many items for a stable person measure as you need persons for a stable item measure. Consequently, 30 items administered to 30 persons (with reasonable targeting and fit) should produce statistically stable measures.

    The first step is to clarify "similar enough." Just as no person has a height stable to within .01 or even .1 inches, no item has a difficulty stable to within .01 or even .1 logits. In fact, stability to within ±.3 logits is the best that can be expected for most variables. Lee (RMT 6:2 p.222-3) discovers that in many applications one logit change corresponds to one grade level advance. So when an item calibration is stable within a logit, it will be targeted at a correct grade level.

    For groups of items, Wright & Douglas (Best Test Design and Self-Tailored Testing, MESA Memo. 19, 1975) report that, when calibrations deviate in a random way from their optimal values, "as test length increases above 30 items, virtually no reasonable testing situation risks a measurement bias [for the examinees] large enough to notice." For even shorter tests, measures based on item calibration with random deviations up to 0.5 logits are "for all practical purposes free from bias."

    Theoretically, the stability of an item calibration is its modelled standard error. For a sample of N examinees that is reasonably targeted at the items and that responds to the test as intended, average item p-values are in the range 0.5 to 0.87, so that modelled item standard errors are in the range 2/sqrt(N) [...]


    designed pilot studies. The Table suggests other ranges. Inflate these sample sizes by 10%-40% if there are major sources of unmodelled measurement disturbance, such as different testing conditions or alternative curricula.

    If much larger samples are conveniently available, divide them into smaller, homogeneous samples of males, females, young, old, etc. in order to check the stability of item calibrations in different measuring situations.

    Item calibrations   Confidence   Minimum sample size range   Size for most
    stable within                    (best to poor targeting)    purposes
    -----------------   ----------   -------------------------   -------------
    ± 1 logit           95%          16 -- 36                    30
    ± 1 logit           99%          27 -- 61                    50
    ± ½ logit           95%          64 -- 144                   100
    ± ½ logit           99%          108 -- 243                  150

    John Michael Linacre
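The sample-size ranges in the table can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions rather than Linacre's own procedure: the multiplier 2.6 for 99% confidence and the factors 4 and 9 are taken from the explanatory notes, while the multiplier 2.0 for 95% confidence (a rounded 1.96) and the rounding to the nearest whole number are assumptions chosen because they match the tabled values. The function name `sample_size_range` is hypothetical.

```python
def sample_size_range(half_width, z):
    """Minimum sample sizes (best targeting, poor targeting) for a dichotomous item.

    half_width: required stability in logits, e.g. 1.0 or 0.5
    z: two-tailed normal multiplier for the chosen confidence level
    """
    target_se = half_width / z       # standard error the calibration must reach
    best = 4 / target_se ** 2        # perfectly targeted: S.E. = 2 / sqrt(N)
    poor = 9 / target_se ** 2        # poorly targeted bound: factor 9 from the notes
    return round(best), round(poor)

# z = 2.6 (99%) is quoted in the notes; z = 2.0 (95%) is an assumption.
for confidence, z in (("95%", 2.0), ("99%", 2.6)):
    for half_width in (1.0, 0.5):
        low, high = sample_size_range(half_width, z)
        print(f"within {half_width} logit, {confidence}: {low} -- {high}")
```

The "size for most purposes" column is a practical recommendation and is not derived by this calculation.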

    Explanatory notes:

    1. "For a ±1 logit interval this S.E. is ±1/2.6 logits." An estimate's standard error (S.E.) is the modelled standard deviation of the normal distribution of the observed estimate around its "true" value. Suppose we want to be 99% confident that the "true" item difficulty is within 1 logit of its reported estimate. Then the estimate needs to have a standard error of 1.0 logits divided by 2.6 or less = 1/2.6 = 0.385 logits.

    2. "This gives a minimum sample in the range 4(2.6)² < N < 9(2.6)²" With optimum targeting of a dichotomous test, the modelled probability of each response is p = 0.5. Then the modelled binomial variance = 0.5 × 0.5 = the information in a response. Thus N perfectly targeted observations have information N × 0.5 × 0.5 = [...]

    [...] University of Chicago, 1976

    Wright & Douglas (1976) "Rasch Item Analysis by Hand": "In other work we have found that when [test length] is greater than 20, random values of [item calibration] as high as 0.50 have negligible effects on measurement."

    Wright & Douglas (1975) "Best Test Design and Self-Tailored Testing": "They allow the test designer to incur item discrepancies, that is item calibration errors, as large as 1.0. This may appear unnecessarily generous, since it permits use of an item of difficulty 2.0, say, when the design calls for 1.0, but it is offered as an upper limit because we found a large area of the test design domain to be exceptionally robust with respect to independent item discrepancies."

    Wright & Stone (1979) "Best Test Design" p.98 - "random uncertainty of less than .35 logits," referencing MESA Memo 19: Best Test and Self-Tailored Testing. Benjamin D. Wright & Graham A. Douglas, 1975. Also .3 logits in Solving Measurement Problems with the Rasch Model. Journal of Educational Measurement 14 (2) pp. 97-116, Summer 1977 (and MESA Memo 42).

    Sample Size and Item Calibration Stability. Linacre JM. Rasch Measurement Transactions 1994 7:4 p.328

    5.2 What do Infit and Outfit, Mean-square and Standardized mean?

    These are all "fit" statistics. In a Rasch context they indicate how accurately or predictably data fit the model. Dichotomous fit statistics. Polytomous fit statistics.

    Infit means inlier-sensitive or information-weighted fit. This is more sensitive to the pattern of responses to items targeted on the person, and vice-versa. For example, infit reports overfit for Guttman patterns, underfit for alternative curricula or idiosyncratic clinical groups. These patterns can be hard to diagnose and remedy.

    Outfit means outlier-sensitive fit. This is more sensitive to responses to items with difficulty far from a person, and vice-versa. For example, outfit reports overfit for imputed responses, underfit for lucky guesses and careless mistakes. These are usually easy to diagnose and remedy.
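The difference between the two statistics is easier to see with numbers. The sketch below uses the standard textbook definitions for dichotomous data (outfit as the plain average of squared standardized residuals, infit as the information-weighted version); it is not Winsteps code, and the person ability, item difficulties and response patterns are hypothetical values chosen only for illustration.

```python
import math

def rasch_probability(ability, difficulty):
    """Dichotomous Rasch model: probability of a correct response."""
    return 1.0 / (1.0 + math.exp(difficulty - ability))

def person_fit(ability, difficulties, responses):
    """Return (infit, outfit) mean-squares for one person's 0/1 responses."""
    squared_residuals, variances = [], []
    for difficulty, x in zip(difficulties, responses):
        p = rasch_probability(ability, difficulty)
        squared_residuals.append((x - p) ** 2)
        variances.append(p * (1.0 - p))   # model variance = information
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    # Infit: squared residuals weighted by the information in each response.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit

difficulties = [-2.0, -1.0, 0.0, 1.0, 4.0]   # hypothetical item difficulties in logits
expected = [1, 1, 1, 0, 0]                   # orderly pattern for a person at 0.5 logits
lucky = [1, 1, 1, 0, 1]                      # same person, lucky guess on the hardest item

infit_e, outfit_e = person_fit(0.5, difficulties, expected)
infit_l, outfit_l = person_fit(0.5, difficulties, lucky)
print(f"orderly pattern: infit {infit_e:.2f}, outfit {outfit_e:.2f}")
print(f"lucky guess:     infit {infit_l:.2f}, outfit {outfit_l:.2f}")
```

With these values the single unexpected success on the far-off item pushes the outfit mean-square up much more than the infit mean-square, which is the outlier sensitivity described above.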

    Mean-square [...]


    Standardized fit statistics are usually obtained by converting the mean-square statistics to the normally-distributed z-standardized ones by means of the Wilson-Hilferty cube root transformation.
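The cube root transformation can be written in one line. The sketch below uses the form commonly given for Rasch fit statistics, in which `q` is the model standard deviation of the mean-square; how `q` is obtained is not covered in this booklet, so here it is simply assumed to be supplied, and the function name `zstd_from_mean_square` is hypothetical.

```python
def zstd_from_mean_square(mean_square, q):
    """Wilson-Hilferty cube-root approximation to a unit-normal deviate.

    mean_square: infit or outfit mean-square (positive)
    q: model standard deviation of that mean-square (assumed supplied)
    """
    return (mean_square ** (1.0 / 3.0) - 1.0) * (3.0 / q) + q / 3.0

# Hypothetical values: the same mean-square is more "significant" when q is small,
# which in practice happens with larger samples.
for q in (0.6, 0.3, 0.1):
    print(q, round(zstd_from_mean_square(1.3, q), 2))
```

This illustrates why a modest mean-square can still give a large standardized value when the sample is large.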

    Anchored runs: Anchor values may not exactly accord with the current data. To the extent that they don't, fit statistics can be misleading. Anchor values that are too central for the current data tend to make the data appear to fit too well. Anchor values that are too dispersed for the current data tend to make the data appear noisy.

    Mean-square Value   Implication for Measurement
    -----------------   ---------------------------
    > 2.0               Distorts or degrades the measurement system. May be caused by only one or two observations.
    1.5 - 2.0           Unproductive for construction of measurement, but not degrading.
    0.5 - 1.5           Productive for measurement.
    < 0.5               Less productive for measurement, but not degrading. May produce misleadingly high reliability and separation coefficients.

    Standardized Value   Implication for Measurement
    ------------------   ---------------------------
    ≥ 3                  Data very unexpected if they fit the model (perfectly), so they probably do not. But, with large sample size, substantive misfit may be small.
    2.0 - 2.9            Data noticeably unpredictable.
    -1.9 - 1.9           Data have reasonable predictability.
    ≤ -2                 Data are too predictable. Other dimensions may be constraining the response patterns.
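When many items or persons have to be screened, the two interpretation tables can be applied mechanically. The helper functions below are hypothetical and only a sketch: the tables do not say which row a boundary value such as exactly 1.5 belongs to, nor how values between the tabled one-decimal limits are treated, so the comparison operators used here are assumptions.

```python
def interpret_mean_square(mean_square):
    """Map an infit or outfit mean-square to the tabled implication."""
    if mean_square > 2.0:
        return "distorts or degrades the measurement system"
    if mean_square > 1.5:
        return "unproductive for construction of measurement, but not degrading"
    if mean_square >= 0.5:
        return "productive for measurement"
    return "less productive for measurement, but not degrading"

def interpret_standardized(zstd):
    """Map a standardized fit value to the tabled implication."""
    if zstd >= 3.0:
        return "data very unexpected if they fit the model"
    if zstd >= 2.0:
        return "data noticeably unpredictable"
    if zstd > -2.0:
        return "data have reasonable predictability"
    return "data are too predictable"

# Hypothetical fit values for three items: (mean-square, standardized value).
for name, mnsq, zstd in (("Item A", 1.1, 0.4), ("Item B", 2.3, 3.2), ("Item C", 0.4, -2.5)):
    print(name, "|", interpret_mean_square(mnsq), "|", interpret_standardized(zstd))
```

Such labels are a screening aid only; the flagged responses still need to be inspected before an item or person is dropped.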

    What do Infit and Outfit, Mean-square and Standardized mean? Linacre JM. Rasch Measurement Transactions 2002 16:2 p.878


    5.3 Rating Scale Instrument Quality Criteria

    Criterion   Poor   Fair   Good   Very Good   Excellent


    TALE 1#.$ Assessment o Fr98t'on: S8*oos 'n Ar9; A;+ 1# 2:$ 200 INPUT: 2"" Persons %0 Items MEASURED: 2"" Persons %0 Items 2 CATS

    Items MAP OF Persons


    OUR CONSULTANTS:

    MOHD SAIDFUDIN MASODI holds a degree in Architecture from Australia and is a Lead Assessor trained by Nigel Bauer Ltd. UK and RWTUV A.G.-International Germany. He is also an AOTS Japan recipient in EMS. With over 24 years of experience he has managed a diverse range of projects on aspects of QA/QC for both private and public sectors, with specific interest in Institutions of Higher Learning. His papers on quality performance measurement based on Rasch Analysis have been accepted as proceedings in several [...] and published in refereed journals. His paper entitled "Clearer to read and easier to understand: Rasch Analysis in Learning Outcomes" won the Best Conference Paper in ICEED [...]. [...] Planning and QA/QC for Quality Improvement. He is currently the Program Coordinator for the Exec.Dip. in Quality Management System at SPACE Universiti Teknologi Malaysia. In year 2[...] and 2[...] he was consecutively named SPACE UTM Best Visiting Lecturer. For details visit: http://www.linkedin.com/in/saidfudin

    AZRILAH ABD AZIZ earned a Bachelor of Science in Computer Science and Mathematics degree from Kent State University, Kent, Ohio, US and her masters in Management Information Systems from the International Islamic University Malaysia. Her Ph.D is on a competency measurement index of Information Professionals using the Rasch Model. Her papers on Rasch Measurement have been presented in international conferences and published in refereed journals, viz. WSEAS Transactions journals, IEEEXplore, NAUN etc.

    She has conducted several workshops and short courses regularly, locally and abroad, in areas of Performance Measurement and Strategic Planning based on Rasch Model Measurement and Analysis using Winsteps and Bond&Fox. For further enquiries she can be reached at azrilah@gmail.com
