EDUC 5535 Spring 2013

Validity


Standardized Tests: A Review


Standardized tests are an artifact of the eugenics movement (in the 1920s): an attempt to sort people by their perceived intelligence or ability.

Short buzz: How are tests used to sort?

Examples (test: sorting purpose)
◦ SAT: who goes to what college
◦ ITBS (Iowa Test of Basic Skills): who is on grade level, who isn't
◦ CELA: who is proficient in English, who isn't

Validity and reliability measures were created because it was important to know whether the tests used for sorting were accurate sorting measures (e.g., Are we sure that the children we are putting in special education really need to be there? Are we sure the ELL children know enough English to achieve in English-medium classrooms?).

Validity: Does the test measure what it purports to measure?
◦ Construct
◦ Content
◦ Concurrent (criterion)
◦ Predictive
◦ Consequential

More importantly - threats to validity and reliability

Content validity: the extent to which an assessment procedure adequately represents the content of the curricular aim being measured.

How do we determine content validity?
◦ Expert review (domain experts)
◦ External reviewers (to check on what the experts created)
◦ Expert panel

1. Which of these countries is not in South America?
◦ Brazil
◦ Canada
◦ Argentina
◦ Venezuela

2. How many continents are there?

Threats to validity?
◦ Reading and writing skill (if you do not read well, you may not do well on the test)
◦ Cultural differences

Red is to firetruck as _______________ is to lemons.

Tortillas are to ________________ as _____________ is to a toaster.

Threats to validity: no opportunity to learn (OTL) analogies, cultural issues, reading level.


Every test in English is first and foremost a test of English.

Every paper/pencil test, no matter what the content is, is first and foremost a test of literacy.

All tests have inherent cultural, linguistic and economic biases

Construct validity: the extent to which empirical evidence confirms that an inferred construct exists and that a given assessment procedure is measuring the inferred construct accurately.

Construct validity is also determined by:
◦ Content experts
◦ External reviewers
◦ Expert panels

◦ Content
◦ Conventions
◦ Spelling
◦ Genre
◦ Audience

A writing assessment that has construct validity has all of the above constructs.

What might be constructs related to being a good reader?

Which title should be underlined?
◦ America the Beautiful
◦ Gone with the Wind
◦ Damn Yankees

Threats to construct validity?
◦ Child knows the rule for underlining but does not know which of the above is a song, book or play

◦ Read the text provided to you
◦ Answer the comprehension questions
◦ Final comprehension question: Choose a title for this essay.

Scoring: Final question weighted more heavily because it is an inferential question.

Threats to validity?
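To make the scoring question concrete, here is a minimal sketch of how weighting one item more heavily can change which student appears to be the stronger reader. The number of questions, the weight of 3 on the final question, and both students' answers are hypothetical; the slide only says the final question is "weighted more heavily."

```python
def weighted_score(item_scores, weights):
    """Return a percent score from per-item results (1 = correct, 0 = incorrect)
    and per-item weights."""
    if len(item_scores) != len(weights):
        raise ValueError("item_scores and weights must have the same length")
    earned = sum(score * weight for score, weight in zip(item_scores, weights))
    return 100 * earned / sum(weights)

# Hypothetical test: four literal questions plus the final "choose a title" question.
equal_weights = [1, 1, 1, 1, 1]
heavy_final = [1, 1, 1, 1, 3]  # assumed weight; not specified in the source

student_a = [1, 1, 1, 1, 0]  # all literal questions right, misses the title
student_b = [0, 0, 1, 1, 1]  # half the literal questions right, picks the title

for label, weights in [("equal", equal_weights), ("heavy final", heavy_final)]:
    a = weighted_score(student_a, weights)
    b = weighted_score(student_b, weights)
    print(f"{label}: student A = {a:.1f}, student B = {b:.1f}")
```

In this made-up case the ranking of the two students reverses once the final item is weighted. If success on that item depends on something other than comprehension (for example, familiarity with how English essay titles are written), the weighting amplifies that threat.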

Do outcomes from one assessment correlate positively to another assessment that purports to measure the same constructs?

Concurrent validity is measured by comparing outcomes of one assessment to another (e.g., compare LAS to CELA, ITBS to CSAP, SAT to ACT); look for correlations above .50.
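As an illustration of the correlation check, the sketch below computes a Pearson correlation between two sets of scores for the same students and compares it to the .50 guideline. The scores are invented for the example, and Pearson's r is an assumption here; the slide does not say which correlation statistic is meant.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores on each assessment must vary")
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same six students on two assessments
test_a = [410, 520, 480, 600, 550, 450]
test_b = [17, 22, 19, 27, 24, 20]

r = pearson_r(test_a, test_b)
meets_guideline = r > 0.50
print(f"r = {r:.2f}; above .50: {meets_guideline}")
```

A high correlation only shows that the two assessments rank students similarly. If both share the same bias (for example, both are mostly tests of English), they can correlate strongly and still both be invalid for the intended construct.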

Does the test/assessment predict future performance or behavior?

◦ Do SATs predict preparedness for college?
◦ Do GREs predict preparedness for graduate school?
◦ Do 3rd grade reading test scores predict who will struggle in high school?
◦ Do school readiness tests predict who is ready for kindergarten?
◦ Does DIBELS predict reading comprehension?

What are the consequences of outcomes on the assessment? Are the outcomes used the way the test creators intended?

Threats to validity

Tests given to a population they were not intended to be given to (e.g. CSAP to ELLs)

Tests used for unintended purposes (e.g. to rank schools and deem some ‘good’ and others ‘bad’)

Tests measure language skills of students and not knowledge of content

Tests contain economically, linguistically and culturally biased items

Tests were created for one population and given to another (e.g. ITBS or CSAP with L2 students)

Modifications of tests for L2 are inconsistent

Improperly trained people are administering the tests (e.g. paraprofessionals administering ACCESS)

Invalid tests are often highly reliable.

The consistent use of invalid tests creates the façade of an achievement gap.

Shepard discusses uses and abuses of tests (these are validity issues)

In your group identify what you think are the 3 most salient to YOUR work

Do these match practice in your district/school?

What should we do?