Chapter 16 ppt eval & testing 4e formatted 01.10 kg edits

© 2013 Springer Publishing Company, LLC.

Chapter 16: Interpreting Test Scores

Oermann & Gaberson, Evaluation and Testing in Nursing Education, 4th edition


Interpreting Test Scores

♦ A test produces a score
  – Number with no intrinsic meaning
  – Must be compared with something that has meaning
♦ Interpretations can be norm- or criterion-referenced



Test Score Distributions

♦ Scoring a test produces a collection of raw scores, recorded by student name or number
  – Difficult to interpret characteristics of the scores
♦ Arrange in rank order, highest to lowest
  – Reveals range of scores
  – Still difficult to judge how a typical student performed on the test or other characteristics of the obtained scores



Test Score Distributions

♦ Frequency distribution
  – Remove student names or numbers
  – List each score once
  – Tally number of times each score occurs
  – Makes it easier to identify how well the group of students performed on the exam
  – Can represent graphically as a histogram or frequency polygon
    • Displays scores that occurred most frequently, score distribution shape, range
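
The tallying steps above can be sketched in Python. This is only an illustration under stated assumptions: the score list is made-up sample data, and `frequency_distribution` is a hypothetical helper name, not something from the textbook.

```python
from collections import Counter

def frequency_distribution(scores):
    """Return (score, count) pairs, listed from highest to lowest score."""
    counts = Counter(scores)  # tally how many times each score occurs
    return sorted(counts.items(), reverse=True)

# Hypothetical raw scores for ten students (names or numbers already removed).
raw_scores = [42, 45, 45, 47, 47, 47, 50, 50, 53, 60]

for score, count in frequency_distribution(raw_scores):
    # A row of asterisks per score gives a rough text histogram.
    print(f"{score:>3} | {'*' * count} ({count})")
```

Each score appears once with its tally, so the most frequent score (47 here) and the range (42 to 60) are visible at a glance.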



Characteristics of Score Distributions

♦ Symmetry
♦ Skewness
♦ Modality
♦ Kurtosis



Symmetry

♦ Symmetric distribution or curve
  – Equal halves, mirror images of each other
♦ Nonsymmetric or asymmetric distribution or curve
  – Scores cluster at one end, tail toward other end
  – Most nursing test score distributions



Skewness

♦ Skew: direction in which the tail extends
  – Positive skew: tail toward the right (in the direction of positive numbers on a scale)
    • Positively skewed distribution: cluster of scores at low end
  – Negative skew: tail toward the left (in the direction of negative numbers)
    • Cluster of scores at the high end
    • Most nursing test score distributions



Modality

♦ Number of peaks (cluster of scores) in the distribution

♦ Mode
  – Most frequently occurring score in the distribution
♦ Unimodal: one peak
♦ Bimodal: two peaks
♦ Multimodal: many peaks



Kurtosis

♦ Relative flatness or peakedness of the curve
♦ Platykurtic: relatively flat, gently curved
♦ Mesokurtic: moderately curved
♦ Leptokurtic: sharply peaked
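
The slides describe skewness and kurtosis by shape only and give no formulas. As a rough sketch, the moment-based coefficients below are one common way to put numbers on both. The choice of population moments, the helper name `shape_coefficients`, and the sample scores are all assumptions made for illustration.

```python
from statistics import fmean

def shape_coefficients(scores):
    """Return (skewness, excess_kurtosis) from population moments.

    Assumes the scores are not all identical (otherwise m2 is zero).
    """
    n = len(scores)
    mean = fmean(scores)
    m2 = sum((x - mean) ** 2 for x in scores) / n  # variance
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    skewness = m3 / m2 ** 1.5           # < 0: tail to the left, > 0: tail to the right
    excess_kurtosis = m4 / m2 ** 2 - 3  # about 0 for a normal curve
    return skewness, excess_kurtosis

# Hypothetical scores clustered at the high end with one low score.
scores = [55, 70, 78, 80, 82, 84, 85, 86, 88, 90]
skew, kurt = shape_coefficients(scores)
print(f"skewness={skew:.2f}, excess kurtosis={kurt:.2f}")
```

Under this convention a negative skewness matches the slides' negative skew (tail toward the left). Excess kurtosis above zero is commonly read as leptokurtic, below zero as platykurtic, and near zero as mesokurtic.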



“Curving” Grades

♦ Not appropriate if scores lack characteristics of a normal curve
  – Bell-shaped: symmetric, unimodal, mesokurtic


“Curving” Grades

♦ Most score distributions from teacher-made tests not normally distributed

♦ Shape of distribution affected by:
  – Test characteristics
    • Difficult test → positively skewed curve
  – Ability of students
    • Nursing content knowledge not normally distributed
      – Students admitted to nursing program not representative of general population



Measures of Central Tendency

♦ Ways of indicating the score that is most characteristic or typical of the distribution

♦ “Middle” of a distribution, scores tend to cluster around it

♦ Three measures
  – Mode
  – Median
  – Mean



Mode

♦ Most frequently occurring score in a distribution
♦ Must be an actual obtained score
♦ Identified from frequency distribution or graphic display without mathematical calculation
♦ Rough indication of central tendency
♦ Least stable measure of central tendency
  – Can fluctuate considerably among samples drawn from the same population



Median

♦ Point that divides a score distribution into equal halves
♦ 50th percentile: 50% of scores are above and 50% are below
♦ Does not have to be an actual obtained score
  – Even number of scores: median is halfway between the two middle scores
  – Odd number of scores: median is the middle score
♦ Index of location: not influenced by the value of each score
  – Good for skewed distribution



Mean

♦ Mathematical average of all scores
  – Computed by summing individual scores and dividing by the total number of scores
  – Does not have to be an actual obtained score
♦ Value of the mean is affected by every score in the distribution
  – Influenced by extremely high or low scores
  – Not the most accurate measure of central tendency in highly skewed distributions
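
The three measures can be computed with Python's standard `statistics` module. The scores below are hypothetical; the sketch is meant only to show that the median and mean need not be obtained scores and that one extreme score moves the mean more than the median.

```python
from statistics import mean, median, multimode

# Hypothetical raw scores for ten students.
scores = [42, 45, 47, 47, 47, 50, 52, 53, 55, 60]

modes = multimode(scores)  # every most-frequent score; a single entry means one mode
middle = median(scores)    # even count: halfway between the two middle scores
average = mean(scores)     # sum of the scores divided by the number of scores

print(modes, middle, average)  # [47] 48.5 49.8

# Adding one extremely low score shifts the mean more than the median.
with_outlier = scores + [5]
print(median(with_outlier), round(mean(with_outlier), 1))  # 47 45.7
```

Here the median drops by 1.5 points while the mean drops by about 4, which is why the median is often preferred for skewed distributions.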



Selecting a Measure of Central Tendency

♦ Relationship between shape of a distribution and locations of measures of central tendency
  – Normal distribution
    • Mean, median, and mode have the same value
  – Positively skewed distribution
    • Mean is highest, mode is lowest
  – Negatively skewed distribution
    • Mode is highest, mean is lowest
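
A small check of the negatively skewed case, using made-up scores that cluster at the high end. The ordering shown is the typical pattern for a unimodal distribution; small or irregular samples will not always follow it.

```python
from statistics import mean, median, mode

# Hypothetical negatively skewed scores: cluster at the high end, tail to the left.
scores = [55, 70, 78, 80, 82, 84, 85, 86, 88, 88]

m_mode, m_median, m_mean = mode(scores), median(scores), mean(scores)
print(m_mode, m_median, m_mean)  # 88 83.0 79.6

# Mode is highest and mean is lowest: the low-scoring tail pulls the mean down.
assert m_mode > m_median > m_mean
```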



Measures of Variability

♦ Used to determine how similar or different the test scores are

♦ Score distributions may have similar measures of central tendency and different degrees of variability

♦ Most common measures
  – Range
  – Standard deviation



Range

♦ Simplest measure of variability
♦ Difference between the highest and lowest scores in the distribution
  – Sometimes expressed as highest and lowest scores, rather than a difference score (e.g., 42 to 60)
♦ Can be highly unstable: based on only two values
♦ Tends to increase with number of scores
  – Wider range of test scores from large group of students because of likelihood of an extreme score



Standard Deviation (SD)

♦ Most common and useful measure of variability
♦ Takes every score in the distribution into consideration
♦ Based on differences between each score and the mean
♦ Represents average amount by which scores differ from the mean
  – Smaller if scores cluster tightly around the mean
  – Larger if scores widely scattered over large range
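
To make the contrast concrete, the sketch below computes the range and SD for two hypothetical score sets with the same mean. The slides do not say whether to divide by N or N - 1; this version assumes the population form (divide by N), whereas Python's `statistics.stdev` uses N - 1.

```python
from math import sqrt

def score_range(scores):
    """Difference between the highest and lowest scores."""
    return max(scores) - min(scores)

def standard_deviation(scores):
    """Population SD: square root of the mean squared deviation from the mean."""
    n = len(scores)
    mean = sum(scores) / n
    return sqrt(sum((x - mean) ** 2 for x in scores) / n)

tight = [48, 49, 50, 51, 52]  # mean 50, scores cluster around the mean
wide = [30, 40, 50, 60, 70]   # mean 50, scores widely scattered

print(score_range(tight), round(standard_deviation(tight), 2))  # 4 1.41
print(score_range(wide), round(standard_deviation(wide), 2))    # 40 14.14
```

Both sets have the same central tendency, but the second has ten times the range and ten times the SD.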



Interpreting an Individual Score

♦ Scores on teacher-made tests
  – Norm-referenced interpretations
    • Use mean and SD to interpret individual scores
  – Criterion-referenced interpretations
    • Used in most nursing education settings
    • Scores are compared to a preset standard
    • Example: percentage-correct score
      – Comparison of a student’s score with the maximum possible score
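
One common way to use the mean and SD for a norm-referenced interpretation is to express a score as a number of SDs from the group mean (a z-score). The slides do not name this method, so treat it as an assumed illustration; `z_score` is a hypothetical helper and the class scores are made up.

```python
from statistics import mean, pstdev

def z_score(score, scores):
    """Number of SDs a score lies above (+) or below (-) the group mean."""
    return (score - mean(scores)) / pstdev(scores)

# Hypothetical class scores: mean 50, population SD about 14.14.
class_scores = [30, 40, 50, 60, 70]

print(round(z_score(70, class_scores), 2))  # 1.41
print(round(z_score(50, class_scores), 2))  # 0.0
```

A score of 70 sits about 1.4 SDs above the class mean, which describes the student's standing relative to the group rather than against a preset standard.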



Percentage-Correct Scores

♦ Derived (not raw) score
♦ Often used as a basis for assigning grades
♦ Determined more by test item difficulty than by quality of performance
  – If test is more difficult than expected, teacher may want to adjust the raw scores before calculating the percentage correct
♦ Not to be confused with percentile score
  – Percentile score is a norm-referenced interpretation
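
The difference between the two scores is easy to see side by side. In this sketch the 60-point maximum and the class scores are hypothetical, and percentile rank is taken as the percentage of scores strictly below the given score; other definitions (for example, counting half of the tied scores) are also in use.

```python
def percentage_correct(raw_score, max_score):
    """Criterion-referenced: raw score as a percentage of the maximum possible."""
    return 100 * raw_score / max_score

def percentile_rank(score, scores):
    """Norm-referenced: percentage of scores in the group below this score."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

# Hypothetical class results on a 60-point test.
class_scores = [42, 45, 47, 47, 47, 50, 52, 53, 55, 60]

print(percentage_correct(45, 60))         # 75.0
print(percentile_rank(45, class_scores))  # 10.0
```

The same raw score of 45 is 75% correct, yet only 10% of the class scored lower.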



Interpreting an Individual Score

♦ Scores on standardized tests
  – Usually used to make norm-referenced interpretations
  – More relevant to general rather than specific instructional goals
    • Should not be used to determine course grades
  – Usually reported in derived scores
    • Percentile ranks
    • Standard scores
    • Norm-group scores

(cont’d)



Interpreting an Individual Score

♦ Scores on standardized tests (cont’d)
  – Important to specify an appropriate norm group for comparison
  – User’s manual includes norm tables with descriptions of each norm group
  – Teacher should select the norm group that most closely matches the group of students
    • Examples: type of nursing program, public or private

