4 Mental Abilities Note

8/6/2019 4 Mental Abilities Note

http://slidepdf.com/reader/full/4-mental-abilities-note 1/25

There are lies, damned lies, and statistics.

Measurement Issues in Psychology

Measurement in psychology differs from the physical sciences: psychological attributes (i.e., constructs) cannot be measured directly.

- THEREFORE: We need to find manifest behaviour that is indicative of the construct of interest (i.e., test scores / responses).

ALSO: There are theoretical considerations regarding scale type. Can a construct be measured using a nominal (categorical labels), ordinal (orders, e.g., rankings), interval (equal intervals = equal differences, but zero is arbitrary, e.g., temperature), or ratio scale (true zero-point)?

- HOWEVER: Measurement is (always) susceptible to error.

How do we control for errors & evaluate the properties of a task?

Classical Test Theory

CTT is based on the premise that test scores are influenced by two factors:

- Factors contributing to consistency: the stable attributes that are being measured
- Factors contributing to inconsistency: characteristics of the test, testing situation, or individual that affect the test score (error factors)

Test Reliability

The STABILITY & CONSISTENCY of a test as a measurement instrument

- Tests are never perfectly reliable due to random errors of measurement

An unreliable test cannot be perfectly valid as a measure of attributes or as a means of predicting outcomes

What Affects Reliability

An individual will NOT obtain identical scores on all test occasions: scores will be distributed normally around their 'true' score

- This variation derives from random factors within the person &/or environment (errors of measurement) that are found in all psychological tests

Tests with a small error of measurement are more reliable than those with a large error of measurement

- An estimate of the size of this error is provided in calculating reliability

The lower the variance in error, the closer rxx is to 1

The higher the variance in error, the closer rxx is to 0
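The link between error variance and rxx can be made concrete with a small sketch. It assumes the standard classical test theory definition of reliability as the share of observed-score variance that is true-score variance, rxx = var(T) / (var(T) + var(E)). The function name and the example variances are illustrative only and are not taken from the notes.

```python
def reliability(true_var: float, error_var: float) -> float:
    """Reliability as true-score variance over total observed variance.

    Assumes true scores and errors are uncorrelated, so the observed
    variance is simply true_var + error_var.
    """
    if true_var <= 0 or error_var < 0:
        raise ValueError("true_var must be positive and error_var non-negative")
    return true_var / (true_var + error_var)


# Illustrative values only: true-score variance held at 100
# while the error variance grows.
for error_var in (0, 25, 100, 900):
    print(error_var, round(reliability(100, error_var), 2))
```

With no error variance the coefficient is 1; as error variance grows relative to true-score variance it shrinks towards 0.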

Sources of Error of Measurement

Item Selection:
- A test is always a sample of behaviour; not all possible combinations of assessment are made

Test Administration:
- Environmental conditions (noise, heat, etc.)
- The examiner (race, sex, manner, body language, etc.)
- The participant (anxiety, fatigue, motivation, etc.)

Test Scoring:
- Not all tests (e.g., projective tests, essay questions) have a standardized scoring procedure, leading to an increased possibility of error of measurement

These three sources are known as random measurement error

Classic NIGHTMARE example: Through inattention you miss one question in a multiple-choice assessment & fill in the answers to all remaining items in the wrong places!

Sources of Error of Measurement

A more insidious occurrence is Systematic Measurement Error:

- A task may inadvertently & consistently be assessing an attribute other than the one of interest
- The source of error in this case would be

$X = T + e_s + e_u$

- Where X = observed score, T = true score, e_s = systematic error & e_u = random error
- Two error components, so the error of measurement increases

Measures of Reliability: Temporal Stability

Are test scores consistent over time? (NB: some results would not be expected to be stable, e.g., those affected by normal developmental changes)

Test-retest reliability
- The correlation (denoted as r1,2) between scores from the same subjects given the same test after a period of time has elapsed
- Criticism: people often do better on the second occasion of testing, i.e., practice effects

Alternate (or Parallel) forms reliability
- The correlation between scores from the same subjects on 2 tests constructed with (as far as possible) equal content, range & difficulty levels
- Criticism: alternate forms are just that, & item sampling differences may introduce greater error of measurement for some individuals

Test-Retest Reliability
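The test-retest coefficient r1,2 is an ordinary correlation between the two sets of scores. The sketch below assumes a Pearson product-moment correlation (the notes do not name the type of correlation) and uses made-up scores for five people tested on two occasions; `pearson_r`, `time_1` and `time_2` are hypothetical names.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up scores: the same five people on occasion 1 and occasion 2
time_1 = [98, 105, 110, 92, 120]
time_2 = [101, 104, 113, 95, 118]

r_12 = pearson_r(time_1, time_2)
print(round(r_12, 2))
```

The same function applies to alternate-forms reliability: pass the scores on form A and form B instead of the two occasions.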

Measures of Reliability: Internal Consistency

Is a test consistent within its own item structure?

Split-half reliability
- The correlation b/w test items from two halves of a test (administered only once)

Problem 1: obtaining equivalent halves of a test
- For tests that become progressively more difficult, odd & even items are usually compared

Problem 2: comparison is made between 2 components only half the length of the original test
- Reliability drops when the number of items in a test decreases
- More items = better reliability estimate
- Chance answers play a lesser role in the error score
- Therefore, the correlation between two halves underestimates the reliability of the complete test

Problem 3: instead of giving a single coefficient for a test, the split-half procedure gives different coefficients depending on which items are grouped when the test is split, i.e., the method is imprecise
- Solution: employ a technique that examines the mean of all possible split-half coefficients, & adjusts estimates for the number of items
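Problem 2 can be illustrated with an odd/even split. The notes do not name a correction for the halved test length; the sketch below assumes the Spearman-Brown formula, r_full = 2 * r_half / (1 + r_half), which is one common way to step a half-test correlation up to full length. The responses are made up and all function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to the full test length."""
    return 2 * r_half / (1 + r_half)


def odd_even_split_half(item_scores):
    """item_scores: one list of item scores per person.

    Returns (half-test correlation, Spearman-Brown adjusted estimate).
    """
    odd_totals = [sum(person[0::2]) for person in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(person[1::2]) for person in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)


# Made-up responses: 5 people x 6 items, scored 0/1
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
]
r_half, r_full = odd_even_split_half(responses)
print(round(r_half, 2), round(r_full, 2))
```

For any positive half-test correlation below 1, the adjusted value is larger than the raw one, which is the underestimation described in Problem 2. Splitting the items differently (Problem 3) would change both numbers.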

Cronbach's coefficient alpha (α)
- Currently the most widely used reliability estimate because of its utility across a wide range of assessment instruments

Cronbach's Coefficient α

$r_\alpha = \left(\frac{N}{N-1}\right)\left(1 - \frac{\sum \sigma_i^2}{\sigma^2}\right)$

$r_\alpha$ = coefficient α
$N$ = number of items
$\sigma_i^2$ = the variance of one item
$\sum \sigma_i^2$ = sum of variances of each item
$\sigma^2$ = variance of total test scores

NB: $r_\alpha$ uses item variance to achieve the reliability estimate
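The coefficient can be computed directly from a person-by-item score table. This is a minimal sketch of the formula above, not a reference implementation: it assumes complete data, and it uses the population variance (dividing by n) for both the item and the total variances, which gives the same ratio as the sample variance would. The two tiny data sets are made up to show the extremes.

```python
def variance(values):
    """Population variance (divides by n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def cronbach_alpha(item_scores):
    """item_scores: one list of item scores per person (same items for all)."""
    n_items = len(item_scores[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across people; these are summed below
    item_vars = [
        variance([person[i] for person in item_scores]) for i in range(n_items)
    ]
    # Variance of each person's total test score
    total_var = variance([sum(person) for person in item_scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Made-up data: two items that rank people identically
consistent = [[1, 1], [2, 2], [3, 3]]
# Made-up data: two items that are unrelated to each other
unrelated = [[1, 0], [0, 1], [1, 1], [0, 0]]

print(cronbach_alpha(consistent), cronbach_alpha(unrelated))
```

In the first data set the items agree perfectly and alpha reaches its maximum of 1; in the second the item variances add up to the whole total-score variance and alpha falls to 0.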

Homogeneity

Cronbach's α is a more sensitive measure of the degree to which a test measures a single factor than is the split-half method

- Example: say items 1 & 2 on a test assess personality, items 3 & 4 test general knowledge, items 5 & 6 maths, etc.
- A typical odd/even split-half will produce a good r, but the test is obviously heterogeneous
- As Cronbach's α checks every possible split-half r, it will provide a low reliability estimate & warn that something is wrong

A Good Reliability Coefficient Depends on:
- intended use of test score
- characteristics of the sample
- characteristics of the test
- method used for reliability

In general, reliability is considered:
- .90 & higher - excellent
- .80 - .90 - good
- .70 - .80 - moderate
- .60 - .70 - questionable
- less than .60 - not acceptable for a commercial test, but often considered satisfactory for experimental purposes

The Good

Stanford Binet V
- Full Scale IQ Reliability = 0.95-0.98
- Subscale Reliabilities = 0.84-0.89

The Raven's Progressive Matrices
- Test-Retest Reliability = 0.85+
- Internal Consistency = 0.90+

The Bad

Myers-Briggs Type Indicator
- Measures psychological preferences:
  Extroversion vs Introversion
  Sensing vs Intuition
  Thinking vs Feeling
  Judging vs Perceiving
- Argues that personality preferences are largely stable
- Test-retest reliability for sub-scales = 0.4-0.7
- Roughly 50% of participants reclassified over a 5 week period
- Popular in a range of settings - why?

There are lies, damned lies, and then there's the Myers-Briggs

Estimating True Scores

REMEMBER: we can never know an individual's true score, but we can make an estimate of it
- It is often important to know how close the observed score is to an individual's true level of performance:
  Personnel selection
  Litigation
  Career counselling
  Treatment

This can be achieved using the Standard Error of Measurement (SEm)

Standard Error of Measurement

Standard deviation of expected observed scores when the true score is held constant
- The amount of error associated with our measure
- Based on reliability of the measure & variability in test scores

Confidence Intervals based on the SEm
- SEm is used to give an estimate of the range in which the true score is likely to fall with a specific level of confidence (confidence intervals)

Standard Error of Measurement: A worked example

Intelligence (IQ)
- IQ score = 100, Standard Deviation = 15
- Reliability index rxx = .80

$SE_m = \sigma_x \sqrt{1 - r_{xx}}$

$SE_m = 15 \sqrt{1 - .80}$

$SE_m = 6.71$

95% interval: 100 ± 1.96 × 6.71 ≈ 100 ± 13

95% confident that the interval 87 to 113 will contain the true score
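The worked example can be checked numerically. The sketch below applies the SEm formula to the values given (score 100, SD 15, rxx = .80). The multiplier z = 1.96 for a 95% interval is an assumption, since the notes do not state which value was used; the function names are illustrative.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEm = SD * sqrt(1 - rxx)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)


def confidence_interval(observed, sd, reliability, z=1.96):
    """Interval observed +/- z * SEm; z = 1.96 is assumed for 95% confidence."""
    margin = z * standard_error_of_measurement(sd, reliability)
    return observed - margin, observed + margin


sem = standard_error_of_measurement(15, 0.80)
low, high = confidence_interval(100, 15, 0.80)
print(round(sem, 2), round(low), round(high))  # 6.71 87 113
```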

Interpretation of Test Scores

Example: measured IQ = 105
- Size of CI depends on reliability

Reliability   Standard Error of Measurement (SEm)   Confidence Interval
1.00          0
.90           4.7                                    95 - ...
.80           6.7                                    92 - ...
.70           8.2                                    89 - 1...
.60           9.5                                    86 - 1...
...           ...                                    ...
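The SEm values in the table follow from the same formula as the worked example, assuming the IQ standard deviation of 15 used there. The short sketch below recomputes that column only; the confidence interval column is left alone because the multiplier and rounding used for it are not stated. `SD`, `RELIABILITIES` and `sem_table` are illustrative names.

```python
from math import sqrt

SD = 15  # IQ standard deviation, as in the worked example
RELIABILITIES = (1.00, 0.90, 0.80, 0.70, 0.60)

# SEm = SD * sqrt(1 - rxx), rounded to one decimal place as in the table
sem_table = {rxx: round(SD * sqrt(1 - rxx), 1) for rxx in RELIABILITIES}

for rxx, sem in sem_table.items():
    print(f"{rxx:.2f}  {sem}")
```

Because SEm grows as reliability falls, any interval built from it widens in step, which is the point of the table.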

Test Validity

The SOUNDNESS & RELEVANCE of the proposed interpretation of test scores
- How well does a test measure what it is designed to measure?

WRONG: "How valid is this test?"
- A test relevant to one decision may have no value in another

RIGHT: "Is this test valid for ...?"

There is no such thing as THE validity of a test
- Validity is NOT absolute

Some things to keep in mind: Reliability & Validity

Reliability is a NECESSARY but NOT SUFFICIENT precondition of validity
- An unreliable test (i.e., one that provides inconsistent scores) CANNOT be valid!

HOWEVER
- Tests that are reliable are NOT necessarily valid!

Construct Validity

Constructs are defined by theory - construct validity concerns the nature of processes captured by a test
- Established by examining the relationship b/w test scores & other measures

1) Convergent Validity - is the construct related to other theoretically similar constructs/tests?
   Expect high correlation with similar tests

2) Discriminant Validity - is the construct independent of other psychological constructs?
   Expect low correlation with other constructs

Other Facets of Validity that contribute to Construct Validity

Content Validity - does test content relate to all aspects of the construct?
- Established by determining that behaviours sampled by a test are a representative sample of the attribute being measured, i.e., by examining the test itself
- Important to achievement tests certifying competency in given areas, i.e., a test should contain only items that should be expected to be known, e.g., Psych 1002 exam
- A test exhibiting good content validity will have clearly worded, unambiguous items covering the main facts, concepts, &/or processes of interest

Content Validity Example: Subtest Arithmetic (WAIS III)

Aim: To assess general mathematical reasoning skills

Test should contain not only
- Addition problems

But also
- Subtraction
- Division
- Multiplication
- Fractions

Other Facets of Validity that contribute to Construct Validity

Criterion-Related Validity - particularly important for tests used in personnel selection
- Requires a measure of performance (i.e., criterion) predicted by a test

1) Predictive Validity - performance on the criterion measure is collected later, when the person is established in the job

2) Concurrent Validity - no time difference in comparison b/w criterion (collected on established employees) & test scores (of new candidates)
   It is assumed new candidates (if suitable) will obtain scores corresponding to successful employees (matched on test scores)

Additional considerations in assessment: Standardization & Norm-Referenced Testing

Standardization
- Tests are administered to large, representative samples of the population to assess the psychometric properties, & to provide normative data on test performance
- Test administration should be conducted exactly the same way each time to maintain validity (e.g., instructions, time limits, scoring procedures, situation)

Norm-referenced Testing
- Comparison to a large representative sample from the same population, either via:
  Standard score (e.g., IQ: M = 100, SD = 15)
  OR
  Percentile scores

Norm-Referenced Testing

Norm-referenced interpretation is a relative interpretation based on an individual's position with respect to some group
- Norms consist of the scores of the normative (i.e., comparison) group

Example
- Participant X solved 41 out of 60 items in an IQ test. What does this mean?
- Participant Y solved 24 out of 35 items in an IQ test. Who is smarter, X or Y?

Norm-Referenced Interpretation

Target group(s) are determined by intended use of test

Selection of standardisation sample needs to be representative
- Ideally assess the whole population
- Random sampling: often impossible
- Stratified Sampling: identify subgroups (strata) & sample from within these using population proportions (randomly if possible)
- Convenience Sampling: e.g., first-year psych students. Representativeness?
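Stratified sampling with population proportions can be sketched as follows. This is an illustration under simple assumptions, not a full sampling design: each unit belongs to exactly one stratum, allocation is proportional with per-stratum rounding, and units are drawn at random within each stratum. The function name, its parameters and the toy population are all hypothetical.

```python
import random


def stratified_sample(population, stratum_of, sample_size, seed=None):
    """Draw a proportionally allocated random sample from each stratum.

    population: list of units; stratum_of: function mapping a unit to its stratum.
    Allocation is rounded per stratum, so the total can differ slightly
    from sample_size when the proportions do not divide evenly.
    """
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        # The stratum's share of the sample matches its share of the population
        k = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample


# Made-up population: 60% in stratum "A", 40% in stratum "B"
population = [("A", i) for i in range(60)] + [("B", i) for i in range(40)]
sample = stratified_sample(population, lambda unit: unit[0], sample_size=10, seed=1)
print(sum(1 for unit in sample if unit[0] == "A"), "from A,",
      sum(1 for unit in sample if unit[0] == "B"), "from B")
```

A convenience sample, by contrast, takes whoever is available, so nothing forces the subgroup shares to match the population.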

Norm-Referenced Tests

Example: Score in a Statistics Anxiety Scale
- Possible Normative Groups (subgroups):
  PSYC100x Students
  General Population
  Physics Students

Norm-referenced testing: Scope of Norms

National Norms
- Involves many thousands of examinees & many groups
- e.g., Australian vs. North American norms

Local Norms
- For specific use in smaller testing applications
- Developed when current sample does not match the sample on which the norms were developed

Age Norms
- Developed for any characteristic that changes with age, e.g., intelligence scales
- Can be considerable overlap between age groups

Norm Scales

If raw scores (number of correct responses) are normally distributed, then determining norm scores is fairly straightforward. A normal distribution is sufficiently described by two parameters: the mean score and the standard deviation. This is actually all we need. We transform the raw score distribution into z-scores by means of this formula:

$z = \frac{x - \bar{x}}{s}$

The difference between the individual raw score $x$ and the mean raw score $\bar{x}$ is divided by the standard deviation $s$. The raw scores are now standardised.

Further linear transformations allow you to transform scores from one norm scale into another.
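Both steps, standardising a raw score and then moving it onto another norm scale, are one-line linear transformations. The sketch below uses the IQ scale (M = 100, SD = 15) mentioned earlier as the target scale; the raw-score mean of 40 and SD of 8 are made-up values for illustration, and the function names are hypothetical.

```python
def z_score(raw, mean, sd):
    """Standardise a raw score against the norm group's mean and SD."""
    return (raw - mean) / sd


def to_norm_scale(z, scale_mean, scale_sd):
    """Linear transformation of a z-score onto another norm scale."""
    return scale_mean + scale_sd * z


# Hypothetical norm group: mean of 40 items correct, SD of 8
z = z_score(48, mean=40, sd=8)
iq = to_norm_scale(z, scale_mean=100, scale_sd=15)
print(z, iq)  # 1.0 115.0
```

A raw score one standard deviation above the norm group's mean lands one standard deviation above the mean of whatever scale it is transformed to.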

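If raw scores are not normally distributed, additional steps are necessary: raw scores --> percentile rank scores --> transformation

PR = 100 * (cum f - f/2) / N

Here cum f is the cumulative frequency up to and including the score, f is the frequency of the score itself, and N is the number of people in the norm group.

To interpret norm scores appropriately, one needs to know the mean score and the standard deviation of the scale.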
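The percentile rank formula can be applied to a frequency table of raw scores. In this sketch, cum f is taken to be the cumulative frequency up to and including the score, which is an assumed reading of the formula; the ten raw scores are made up and `percentile_ranks` is a hypothetical name.

```python
from collections import Counter


def percentile_ranks(raw_scores):
    """PR = 100 * (cum f - f/2) / N for each distinct raw score.

    cum f is the cumulative frequency up to and including the score,
    f is the frequency of the score itself, N is the number of scores.
    """
    counts = Counter(raw_scores)
    n = len(raw_scores)
    ranks = {}
    cum_f = 0
    for score in sorted(counts):
        f = counts[score]
        cum_f += f
        ranks[score] = 100 * (cum_f - f / 2) / n
    return ranks


# Made-up raw scores for ten people
scores = [3, 5, 5, 6, 6, 6, 7, 8, 8, 10]
ranks = percentile_ranks(scores)
print(ranks[6])  # 45.0
```

Subtracting f/2 places each score at the midpoint of the people who share it, so tied scores get one common rank rather than the top of their block.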

Summary
- Measurement in psychology
- Test Reliability: consistency & precision
- Test Validity: soundness & relevance of interpretation
- Standardization & Norm-Referenced Interpretation