Chapter 9 Analyzing Bias and Assuring Fairness p206


Page 1: Chapter 9 Analyzing Bias and Assuring Fairness  p206


Chapter 9 Analyzing Bias and Assuring Fairness p206

• Unfair Discrimination
• Item & Test Bias
• Test-Score Banding

Page 2: Chapter 9 Analyzing Bias and Assuring Fairness  p206


– Bias defined
  • “Systematic group differences in item responses, test scores, or other assessments for reasons unrelated to the trait.”
– Cultural bias defined
  • “if an acceptable response depends on skills or information common in one culture but not in the other.”
– Discrimination defined
  • “Making distinctions”
  – Not the same as unfair discrimination
• Define “unfair” discrimination
• What is the difference between the two? Give an example.


Page 3: Chapter 9 Analyzing Bias and Assuring Fairness  p206


DISCRIMINATION

• Discrimination Based on Group Membership
  – Protected groups
    • Race
    • Color
    • Religion
    • Gender
    • National origin
    • LGBT?

Page 4: Chapter 9 Analyzing Bias and Assuring Fairness  p206


Distributional Differences

Group Mean Differences (Give an example for each below)

1. The two groups are biased samples from their respective populations, e.g. extensive, uncritical recruiting for the lower-scoring group. The test would not be biased (why not?)
2. The two groups are representative samples (not biased if the groups actually differ on the trait)
3. Test items require experiences not common in the lower-scoring group (not biased if the experiences are required)
4. Test administration conditions differ for the two groups

Page 5: Chapter 9 Analyzing Bias and Assuring Fairness  p206


Racial Differences in IQ

• Few believe there are no race differences
  – Means for:
    • East Asians 105
    • Europeans (Whites) 100
    • Blacks 85
  – Cohen effect size
    • Hispanics: .6 to .8 SD below Whites
    • Blacks: 1 SD below Whites
• Many argue about the causes
• Predictability of IQ is comparable for Blacks and Whites

Page 6: Chapter 9 Analyzing Bias and Assuring Fairness  p206


Race Differences in IQ (Furnham ’08, p 207)

• Three plausible explanations
  1. Evidence of biological & genetic differences between races
  2. Evidence of sociocultural, economic & political forces for differences
     - distinct from racial characteristics
     - but confounded with them
  3. Differences are only artifacts of test design, administration, or measurement
     - no real differences

Page 7: Chapter 9 Analyzing Bias and Assuring Fairness  p206


Black-White Racial Differences in IQ

• Greater variation within groups than between
  – 16% of Blacks score above the White mean
  – For a cutoff score of 70 for special education
    • There will be 1 White for every 7 Blacks
  – Black/White differences are constant over time and life span
  – Differences are present prior to school entry
  – Differences are not constant for different types of measures of intelligence
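A quick arithmetic check of where the 16% and 1:7 figures come from. This is only a sketch: it assumes both score distributions are normal with the same SD of 15 and means of 100 and 85 (the means quoted on the earlier slide). The normality and equal-SD assumptions, and the names `reference` and `comparison`, are illustrative and not taken from the slides.

```python
from statistics import NormalDist

# Assumed for illustration: normal distributions with a common SD of 15
# and means 1 SD apart (100 vs. 85).
reference = NormalDist(mu=100.0, sigma=15.0)
comparison = NormalDist(mu=85.0, sigma=15.0)

# Share of the lower-mean group scoring above the higher group's mean.
share_above_reference_mean = 1.0 - comparison.cdf(100.0)

# Ratio of the shares falling below a cutoff of 70 in each group.
cutoff = 70.0
ratio_below_cutoff = comparison.cdf(cutoff) / reference.cdf(cutoff)

print(f"Share above the higher mean: {share_above_reference_mean:.1%}")
print(f"Ratio below cutoff {cutoff:.0f}: about {ratio_below_cutoff:.0f} to 1")
```

Under these assumptions a 1 SD gap in means leaves most of the two distributions overlapping, which is the point of "greater variation within groups than between"; the ratio only becomes large in the tails.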

Page 8: Chapter 9 Analyzing Bias and Assuring Fairness  p206


Black & White Differences in IQ (implications for workforce), Gottfredson (2002)

• 22% of Whites & 59% of Blacks have IQ < 90
  – Considerably fewer Blacks (proportionately) are competitive for mid-level jobs:
    • fire fighting, skilled trades, many clerical jobs
  – Mean IQ is about 100 (1 SD above mean for Blacks)
  – 80 is the threshold for being competitive in lowest-level jobs
    » 4 times as many Blacks (30%) cf. Whites (7%) fall below that threshold

Page 9: Chapter 9 Analyzing Bias and Assuring Fairness  p206


Implications for Black / White IQ Differences

• On the higher end of the distribution (IQ = 125)
  – Score of 125 = mean for professionals (e.g. lawyers, physicians, engineers, high-level executives etc.)
• Black / White ratio is only 1:30 at this level
• Conclusion: Disparate impact
  • with legal and political tension…
  • Is “particularly acute in the most complex, most socially desirable jobs” (Gottfredson, ’02, p. 41).

Page 10: Chapter 9 Analyzing Bias and Assuring Fairness  p206


• Differences in Other Distributional Characteristics (Table 9.1, p. 211)
  – Note: group means are different, but variability is greater
  – At lower selection ratios, differences in proportions may disappear.
• Discrimination as Systematic Measurement Error
  – If discrimination error is systematic and greater for one group than the other (e.g. test-taking habits), it can be unfair even if not illegal

Page 11: Chapter 9 Analyzing Bias and Assuring Fairness  p206


ANALYSIS OF BIAS AND ADVERSE IMPACT IN TEST USE

• Test bias
  • Unwanted sources of variance in scores from different groups
• Adverse impact
  • Social, political or legal term (effects of test use)

Page 12: Chapter 9 Analyzing Bias and Assuring Fairness  p206


ANALYSIS OF BIAS AND ADVERSE IMPACT IN TEST USE

• Test Bias as Differential Psychometric Validity
  – Bias = “when groups matched on the trait have different scores because of one or more sources of variances related to group membership”
    1. It is the “meaning inferred” from scores that may or may not be biased (not the test itself)
    2. It is group related (not just for a single individual)
    3. Groups must be assumed to be equal on the trait
    4. Definition emphasizes sources of group variances (potentially identifiable), not group means
       - e.g. “stereotype threat” (Steele & Aronson, ’95)

Page 13: Chapter 9 Analyzing Bias and Assuring Fairness  p206


ANALYSIS OF BIAS AND ADVERSE IMPACT IN TEST USE

• Adverse Impact (legal term, not statistical)
  – Mean differences alone do not indicate bias
    • How does this “attitude problem” force adversarial roles?
    • What’s a better term?
  – Adverse impact reasons:
    1. Chance (not due to bias)
    2. Measurement problems
    3. Nature of test use
    4. Differences in distribution sizes
    5. Reliable sub-group approaches to test taking
    6. True population differences in trait (not due to bias)
    Note: Table 9.2, p. 216
• Criterion Bias (criterion must be valid)

Page 14: Chapter 9 Analyzing Bias and Assuring Fairness  p206


DIFFERENTIAL ITEM FUNCTIONING (DIF)

• DIF preferred over ‘bias’
  – “Simple minded item difficulty statistics”
    • You can’t consider the item itself (dependent upon the trait distribution, thus confounded with it)
  – Court cases:
    • Golden Rule Insurance Company v. Washburn (’84)
      – Mandated that group item difficulty could not differ by more than .15!!
    • Allen v. Alabama State Board of Education (’85)
      – More restrictive: not more than .05 max difference!!!
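To make the "simple minded" statistic concrete, here is a minimal sketch of a group item-difficulty comparison of the kind those thresholds refer to. Item difficulty is taken as the proportion answering correctly. The response matrices and the function names `item_difficulty` and `flag_items` are hypothetical, made up for illustration. The sketch does not match groups on the trait, so it is not a DIF analysis, and that is exactly the weakness the slide points to.

```python
def item_difficulty(responses):
    """Proportion of correct answers per item; each row is one examinee (1 = correct)."""
    n_examinees = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_examinees for i in range(n_items)]


def flag_items(group_a, group_b, max_diff):
    """Indices of items whose difficulty differs between groups by more than max_diff."""
    p_a = item_difficulty(group_a)
    p_b = item_difficulty(group_b)
    return [i for i, (a, b) in enumerate(zip(p_a, p_b)) if abs(a - b) > max_diff]


# Hypothetical 0/1 responses: four examinees per group, three items.
group_a = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [1, 1, 1]]
group_b = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 1, 0]]

flagged = flag_items(group_a, group_b, max_diff=0.15)
print("Items exceeding the .15 difference:", flagged)
```

Because raw difficulty depends on each group's trait distribution, an item can be flagged here even when examinees of equal ability answer it equally well.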

Page 15: Chapter 9 Analyzing Bias and Assuring Fairness  p206


ACTING ON THE FINDINGS

• Corrective Actions (4) Under the Uniform Guidelines (p. 218)
  – Should we maximize the criterion performance or avoid the appearance of discriminatory practice?
  – To ease tensions, how should the Ferguson police dept. deal with the imbalance in Black & White police officers relative to the population’s racial mix?
• Score Adjustments
  – Race norming in U.S. Employment Service (GATB)
    • Scores of Hispanics, Blacks and Whites were percentile ranks within groups
    • What effect did this have?
  – Employment Quotas
    • USTES
    • Are quotas acceptable in other countries?
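As a rough illustration of what "percentile ranks within groups" does to a score, the sketch below converts the same raw score to a percentile rank against two different norm groups. The group labels, the scores, and the mid-rank formula used here are assumptions for illustration only, not the procedure the Employment Service actually applied.

```python
def percentile_rank(score, group_scores):
    """Mid-rank percentile: percent of the group below the score, plus half of the ties."""
    below = sum(1 for s in group_scores if s < score)
    equal = sum(1 for s in group_scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(group_scores)


# Hypothetical raw scores for two norm groups.
group_x = [40, 50, 60, 70, 80]
group_y = [30, 40, 50, 60, 70]

raw_score = 60
rank_in_x = percentile_rank(raw_score, group_x)
rank_in_y = percentile_rank(raw_score, group_y)
print(f"Raw score {raw_score}: {rank_in_x:.0f}th percentile in X, {rank_in_y:.0f}th in Y")
```

The same raw score is reported as a different standing depending on the group it is ranked within, which is one way to start answering the question of what effect the adjustment had.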

Page 16: Chapter 9 Analyzing Bias and Assuring Fairness  p206


Analysis of Bias (cont’d)

• “Ranges of Indifference” in Test Score Bands
  – Band Width
    • They exist whatever you do… so how to decide?
    • Standard error of the difference in scores (s_d = s_m √2, where s_m is the standard error of measurement)
    • Adjustment in band width should be based on judgments re: loss of utility
  – Decisions Within Bands
  – Fixed Bands (don’t slither down)
  – Sliding Bands (slither down)
  – Rubber Bands
    • What are these used for?
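The band-width formula can be sketched in a few lines. This assumes the usual classical-test-theory estimate of the standard error of measurement, SEM = SD × √(1 − reliability), and a 1.96 multiplier for the band. The multiplier, the example SD and reliability, and the candidate scores are all illustrative assumptions, not values from the chapter.

```python
import math


def standard_error_of_difference(sd, reliability):
    """SED = SEM * sqrt(2), with SEM = SD * sqrt(1 - reliability)."""
    sem = sd * math.sqrt(1.0 - reliability)
    return sem * math.sqrt(2.0)


def band_width(sd, reliability, z=1.96):
    """Width of the band of scores treated as not reliably different."""
    return z * standard_error_of_difference(sd, reliability)


def in_band(scores, width):
    """Scores falling within `width` points of the highest score present."""
    top = max(scores)
    return [s for s in scores if s >= top - width]


# Hypothetical test: SD of 10, reliability of .90, five candidates.
width = band_width(sd=10.0, reliability=0.90)
scores = [95, 90, 88, 86, 80]

first_band = in_band(scores, width)      # band anchored on the top score
slid_band = in_band(scores[1:], width)   # top scorer selected, band re-anchored

print(f"Band width: {width:.2f}")
print("Initial band:", first_band)
print("Band after the top scorer is selected:", slid_band)
```

With a fixed band the lower limit stays where `first_band` put it until everyone in the band has been selected. With a sliding band the limit is recomputed from the highest remaining score, as in `slid_band`, so the band "slithers down" and brings in new candidates.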