nora-lloyd
Basic Measurement and Statistics in Testing
Outline
Central Tendency and Dispersion
Standardized Scores
Error and Standard Error of Measurement (Sm)
Item Analysis
Central Tendency and Dispersion
Central Tendency
Measures of central tendency are measures of the location of the middle or the center of a distribution. The definition of "middle" or "center" is purposely left somewhat vague so that the term "central tendency" can refer to a wide variety of measures. The mean is the most commonly used measure of central tendency.
Mean
The arithmetic mean is what is commonly called the average. The mean is the sum of all the scores divided by the number of scores.
The formula in summation notation is:
Mean = ΣX / N, where X is each score and N is the number of scores.
The mean is a good measure of central tendency for roughly symmetric distributions but can be misleading in skewed distributions, since it can be greatly influenced by scores in the tail. Therefore, other statistics such as the median may be more informative for distributions that are frequently very skewed, such as reaction time or family income.
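As a rough illustration of how a single score in the tail pulls the mean, the sketch below applies ΣX / N to a small set of hypothetical reaction times. The data and the helper name `arithmetic_mean` are assumptions made for this example, not part of the original slides.

```python
def arithmetic_mean(scores):
    """Sum of all the scores divided by the number of scores."""
    return sum(scores) / len(scores)

# Hypothetical reaction times in milliseconds (illustrative only).
typical = [310, 320, 335, 340, 355]
with_tail = typical + [1500]  # one very slow response in the right tail

mean_typical = arithmetic_mean(typical)
mean_with_tail = arithmetic_mean(with_tail)

print(mean_typical)    # 332.0
print(mean_with_tail)  # noticeably larger than every typical score
```

One extreme score moves the mean above all five typical values, which is why the mean can misrepresent the center of a skewed distribution.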
Median
The median is the middle of a distribution: half the scores are above the median and half are below the median.
The median is less sensitive to extreme scores than the mean and this makes it a better measure than the mean for highly skewed distributions.
Computation of Median
When there is an odd number of numbers, the median is simply the middle number. For example, the median of 2, 4, and 7 is 4.
When there is an even number of numbers, the median is the mean of the two middle numbers. Thus, the median of the numbers 2, 4, 7, 12 is (4+7)/2 = 5.5.
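The odd and even cases can be sketched as a short function. The name `median_of` is chosen for this illustration only; it sorts the scores first, which the rule assumes implicitly.

```python
def median_of(scores):
    """Middle score for an odd count; mean of the two middle scores for an even count."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_of([2, 4, 7]))      # 4
print(median_of([2, 4, 7, 12]))  # 5.5
```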
Mode
The mode is the most frequently occurring score in a distribution and is used as a measure of central tendency. It is the only measure of central tendency that can be used with nominal data.
The mode is greatly subject to sample fluctuations and is therefore not recommended as the only measure of central tendency. A further disadvantage of the mode is that many distributions have more than one mode. These distributions are called "multimodal."
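A minimal sketch of finding the mode, including the multimodal case, is shown here. The helper name `modes` and the sample data are assumptions for illustration. Because it only counts occurrences, it also works with nominal data.

```python
from collections import Counter

def modes(scores):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(scores)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)

print(modes([1, 2, 2, 3]))            # [2]
print(modes([1, 1, 2, 3, 3]))         # [1, 3]  (two modes)
print(modes(["red", "blue", "red"]))  # ['red'] (nominal data)
```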
In a normal distribution, the mean, median, and mode are identical.
Spread, Dispersion, Variability
A variable's spread is the degree to which scores on the variable differ from each other. If every score on the variable were about equal, the variable would have very little spread. There are many measures of spread. The distributions shown below have the same mean but differ in spread: The distribution on the bottom is more spread out.
Variability and dispersion are synonyms for spread.
Spread/Dispersion
Range
The range is the simplest measure of spread or dispersion: It is equal to the difference between the largest and the smallest values.
The range can be a useful measure of spread because it is so easily understood. However, it is very sensitive to extreme scores since it is based on only two values.
The range should almost never be used as the only measure of spread, but can be informative if used as a supplement to other measures of spread.
Example: The range of the numbers 1, 2, 4, 6, 12, 15, 19, 26 is 26 - 1 = 25.
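The range example can be reproduced in a few lines. The second list, in which the largest score is replaced by a hypothetical extreme value of 100, is an assumption added to show the sensitivity to extreme scores.

```python
def score_range(scores):
    """Difference between the largest and the smallest values."""
    return max(scores) - min(scores)

scores = [1, 2, 4, 6, 12, 15, 19, 26]
print(score_range(scores))  # 25

# Hypothetical variation: only the largest score changes.
with_extreme = [1, 2, 4, 6, 12, 15, 19, 100]
print(score_range(with_extreme))  # 99
```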
Variance
The variance is a measure of how spread out a distribution is. In other words, it is a measure of variability.
The variance is computed as the average squared deviation of each number from its mean.
For example, for the numbers 1, 2, and 3, the mean is 2 and the variance will be:
[(1 - 2)² + (2 - 2)² + (3 - 2)²] / 3 = 2/3 = 0.667
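The same calculation can be sketched in code. This follows the definition given here (average squared deviation, dividing by N); the function name `population_variance` is chosen for this example.

```python
def population_variance(scores):
    """Average squared deviation of each score from the mean (divides by N)."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n

print(round(population_variance([1, 2, 3]), 3))  # 0.667
```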
Example of Calculation
Standard Deviation
The standard deviation formula is very simple: it is the square root of the variance. It is the most commonly used measure of spread.
In a normal distribution, about 68% of the scores are within one standard deviation of the mean and about 95% of the scores are within two standard deviations of the mean.
The standard deviation has proven to be an extremely useful measure of spread in part because it is mathematically tractable. Many formulas in inferential statistics use the standard deviation.
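As a sketch, the standard deviation can be computed as the square root of the variance, and the 68% and 95% figures can be checked against the standard normal distribution. The helper name `population_sd` is an assumption for this example; `statistics.NormalDist` requires Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist

def population_sd(scores):
    """Square root of the average squared deviation from the mean."""
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((x - mean) ** 2 for x in scores) / n
    return sqrt(variance)

print(round(population_sd([1, 2, 3]), 3))  # 0.816

# Proportion of a normal distribution within 1 and 2 standard deviations.
standard_normal = NormalDist()  # mean 0, standard deviation 1
within_one = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_two = standard_normal.cdf(2) - standard_normal.cdf(-2)
print(round(within_one, 2), round(within_two, 2))  # 0.68 0.95
```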
Different ways of calculating the standard deviation – the raw score method and the deviation method
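The slides do not spell out the two methods, so the sketch below rests on an assumption: that the deviation method means averaging squared deviations from the mean, and that the raw score method means the computational formula SD = √(ΣX²/N - (ΣX/N)²). Under that reading, both give the same result.

```python
from math import sqrt, isclose

def sd_deviation_method(scores):
    """Assumed deviation method: sqrt of the mean squared deviation."""
    n = len(scores)
    mean = sum(scores) / n
    return sqrt(sum((x - mean) ** 2 for x in scores) / n)

def sd_raw_score_method(scores):
    """Assumed raw score method: sqrt(sum(X^2)/N - (sum(X)/N)^2)."""
    n = len(scores)
    return sqrt(sum(x * x for x in scores) / n - (sum(scores) / n) ** 2)

scores = [1, 2, 3]
print(isclose(sd_deviation_method(scores), sd_raw_score_method(scores)))  # True
```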
Standard deviation score and standard deviation value
Standardized Scores
Z scores and T scores and their uses
Standardized Scores: Z scores
Z-score = (raw score - mean score) / standard deviation
Example (D = X - Mean):
ID   X    Mean   D    SD   Z
1    95   90     5    5    1
2    90   90     0    5    0
3    85   90     -5   5    -1
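The Z-score column in the table can be reproduced with a one-line function. The mean (90) and standard deviation (5) are taken as given from the table rather than recomputed; the name `z_score` is chosen for this sketch.

```python
def z_score(raw, mean, sd):
    """(raw score - mean score) / standard deviation."""
    return (raw - mean) / sd

z_values = [z_score(raw, mean=90, sd=5) for raw in (95, 90, 85)]
print(z_values)  # [1.0, 0.0, -1.0]
```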
Standardized Scores: Z scores
Using the Z-score: comparing scores between two tests
Example: compare the previous scores with these:
ID   X   Mean   D       SD     Z
1    3   5.67   -2.67   2.45   -1.09
2    6   5.67   0.33    2.45   0.13
3    8   5.67   2.33    2.45   0.95
Standardized scores – T scores
Z scores are unfamiliar to many test users, especially when they are negative. T scores rescale them to a mean of 50 and a standard deviation of 10.
Formula for T-score: T = 10(Z) + 50
ID   X   Mean   D       SD     Z       T
1    3   5.67   -2.67   2.45   -1.09   39.1
2    6   5.67   0.33    2.45   0.13    51.3
3    8   5.67   2.33    2.45   0.95    59.5
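The T column can be reproduced from the table's own mean (5.67) and standard deviation (2.45), both taken as given. The sketch rounds Z to two decimals before converting, which is an assumption about how the table values were produced; the function name `t_score` is chosen for this example.

```python
def t_score(z):
    """T = 10(Z) + 50."""
    return 10 * z + 50

mean, sd = 5.67, 2.45  # values as given in the table
rows = []
for raw in (3, 6, 8):
    z = round((raw - mean) / sd, 2)  # Z rounded to two decimals, as in the table
    rows.append((raw, z, round(t_score(z), 1)))

print(rows)  # [(3, -1.09, 39.1), (6, 0.13, 51.3), (8, 0.95, 59.5)]
```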
Error and Standard Error of Measurement (Sm)
Every score has an error
Error either adds to or subtracts from your true score
True score = Obtained score +/- Error
How to calculate error? Sm = SD × √(1 - r), where SD is the standard deviation of the test scores and r is the reliability of the test
Example
Obtained score = 20; SD = 2; r = 0.64
Sm = SD × √(1 - r) = 2 × √(1 - 0.64) = 2 × √0.36 = 2 × 0.6 = 1.2
True score = 20 - 1.2 = 18.8 and 20 + 1.2 = 21.2, or between 18.8 and 21.2 (at 1 SEM)
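The worked example can be checked with a short sketch. The function names `standard_error_of_measurement` and `true_score_band` are chosen for illustration only.

```python
from math import sqrt

def standard_error_of_measurement(sd, r):
    """Sm = SD * sqrt(1 - r)."""
    return sd * sqrt(1 - r)

def true_score_band(obtained, sem, n_sem=1):
    """Obtained score plus or minus n_sem standard errors of measurement."""
    return obtained - n_sem * sem, obtained + n_sem * sem

sem = standard_error_of_measurement(sd=2, r=0.64)
low, high = true_score_band(obtained=20, sem=sem)
print(round(sem, 2), round(low, 2), round(high, 2))  # 1.2 18.8 21.2
```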
Item Analysis
Item difficulty
Item discrimination
Distractor analysis
Item difficulty (p)
How difficult is the item?
Sometimes referred to as item facility.
Used only with objective type tests
Number of students who got the item correct divided by the number of students who attempted the item.
Every item has an item difficulty value
Possible values are from 0 to 1, with 0 indicating a difficult item
Example
30 students attempted the item (* marks the key)
A    B    C    *D
4    0    8    18
Find p
p = No. of students who got it right / No. of students who attempted = 18/30 = .60
Note, this is also equal to 60 percent correct
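A minimal sketch of the item difficulty calculation follows, using the option counts from the example. The function name `item_difficulty` and the dictionary layout are assumptions for this illustration.

```python
def item_difficulty(option_counts, key):
    """Number who chose the key divided by the number who attempted the item."""
    attempted = sum(option_counts.values())
    return option_counts[key] / attempted

counts = {"A": 4, "B": 0, "C": 8, "D": 18}  # D is the key
p = item_difficulty(counts, key="D")
print(p)  # 0.6
```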
Item Discrimination (D)
To discriminate between good and weak students
Must determine the good and weak students first
Performance of good students compared to performance of weak students: the difference in the number who answered correctly, divided by the number of students in either group
Every item has an item discrimination value, which ranges from -1 to 1
Example
Total number of students = 45
Number of students in Upper Group and Lower Group = 15 each
Options       A    B    C    *D
Upper (Ug)    2    0    3    10
Lower (Lg)    2    1    6    6
Compute D
D = (No. in Ug correct - No. in Lg correct) / No. of students in either group
D = (10 - 6) / 15 = 0.267
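The discrimination index for this example can be sketched as below. The function name `item_discrimination` and the dictionary layout are assumptions; the counts come from the table.

```python
def item_discrimination(upper_counts, lower_counts, key):
    """(Upper-group correct - lower-group correct) / size of either group."""
    group_size = sum(upper_counts.values())
    # The formula assumes both groups are the same size.
    assert group_size == sum(lower_counts.values())
    return (upper_counts[key] - lower_counts[key]) / group_size

upper = {"A": 2, "B": 0, "C": 3, "D": 10}
lower = {"A": 2, "B": 1, "C": 6, "D": 6}
d_index = item_discrimination(upper, lower, key="D")
print(round(d_index, 3))  # 0.267
```

A negative value would mean that more weak students than good students answered the item correctly.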
Deciding on Good and Bad Items
Item difficulty
Item discrimination
Check for miskeying, ambiguity and guessing
Evidence for miskeying: more chose distractor than key
Guessing: equal spread across options
Ambiguity: equal number chose one distractor and the key
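The three checks can be turned into a rough screening function. This is only a sketch under stated assumptions: the slides give no numeric thresholds, so the `tolerance` parameter (how close two counts must be to count as "equal"), the order of the checks, and the function name `flag_item` are all hypothetical choices.

```python
def flag_item(option_counts, key, tolerance=1):
    """Return a rough flag for one item based on how responses are spread."""
    values = list(option_counts.values())
    key_count = option_counts[key]
    distractor_counts = [n for opt, n in option_counts.items() if opt != key]

    # Guessing: responses spread about equally across all options.
    if max(values) - min(values) <= tolerance:
        return "possible guessing"
    # Miskeying: clearly more students chose a distractor than the key.
    if any(n > key_count + tolerance for n in distractor_counts):
        return "possible miskeying"
    # Ambiguity: one distractor drew about as many students as the key.
    if any(abs(n - key_count) <= tolerance for n in distractor_counts):
        return "possible ambiguity"
    return "no flag"

# Lower-group counts from the discrimination example (D is the key).
print(flag_item({"A": 2, "B": 1, "C": 6, "D": 6}, key="D"))  # possible ambiguity
# Upper-group counts from the same example.
print(flag_item({"A": 2, "B": 0, "C": 3, "D": 10}, key="D"))  # no flag
```

A flag is only a prompt to inspect the item; the final decision should also weigh the item's difficulty and discrimination values.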
END