Defining, Measuring, and Manipulating Variables

Page 1: Defining, Measuring, and Manipulating Variables

Defining, Measuring, and Manipulating Variables

Page 2: Defining, Measuring, and Manipulating Variables

Operational definition of a construct

Constructs: hunger, aggression, happiness, success, intelligence …

Operational definition: How is the construct measured?

Hunger: subjective feeling rated on a scale of 1-7

Hunger: number of hours since last ate

Accuracy of operational definition

Circular or tautological definitions

Definitions may not match construct

Definitions may differ between researchers

Page 3: Defining, Measuring, and Manipulating Variables

Caffeine consumption questionnaire

How is caffeine consumption operationally defined?

Page 4: Defining, Measuring, and Manipulating Variables

Scales of measurement

Nominal, ordinal, interval, ratio (p61)

Also distinguished as discrete vs. continuous variables

Or qualitative vs. quantitative

Page 5: Defining, Measuring, and Manipulating Variables

What scale of measurement is used?

Item: Scale of measurement

True-false test: Nominal

IQ test scores: Interval

Political affiliation: Nominal

Top 10 basketball teams: Ordinal

Time to finish an exam: Ratio

List of favorite to least favorite teachers: Ordinal

Zip code: Nominal

Class rank: Ordinal

Page 6: Defining, Measuring, and Manipulating Variables

What scale of measurement is used?

Indicate your attitude toward scientific research by placing a check mark on each scale:

Positive __ __ __ __ __ __ __ Negative

Worthless __ __ __ __ __ __ __ Valuable

Unethical __ __ __ __ __ __ __ Ethical

Circle your answer: Scientific research has produced many advances that have significantly enhanced the quality of human life.

strongly agree / agree / neutral / disagree / strongly disagree

Above examples use “Likert scale”

Each response can be numbered (e.g., 1 – 7 on the seven-point scales) = interval scale

Page 7: Defining, Measuring, and Manipulating Variables

Weinle (2003)

Examined the use of drawing to facilitate children’s narratives about emotional events.

Participants: 6-, 7-, and 8-year-olds

Method: Interviewed about “mad” or “sad” events; ½ asked to draw a picture while talking, ½ just talked

Results: Children who drew while talking provided significantly longer and richer narratives

What are the scales of measurement for IVs and DV?

IV: Age = ratio; Activity while talking = nominal; Emotion of event = nominal

DV: Length of narrative = ratio; Richness of narrative = interval or ordinal

Page 8: Defining, Measuring, and Manipulating Variables

Caffeine consumption questionnaire

What scales of measurement are used?

What other questions could be asked that use other scales?

Nominal

Ordinal

Interval

Ratio

Page 9: Defining, Measuring, and Manipulating Variables

Types of measures (p65)

Page 10: Defining, Measuring, and Manipulating Variables

What type of measurement is used? What scale of measurement?

Geriatric Depression Scale (GDS)

Choose the best answer for how you have felt over the past week: YES / NO

1. Are you basically satisfied with your life?

2. Have you dropped many of your activities and interests?

3. Do you feel that your life is empty?

4. Do you often get bored?

5. Are you in good spirits most of the time?

6. Are you afraid that something bad is going to happen to you?

7. Do you feel happy most of the time?

8. Do you often feel helpless?

9. Do you prefer to stay at home, rather than going out and doing new things?

10. Do you feel you have more problems with memory than most?

11. Do you think it is wonderful to be alive now?

12. Do you feel pretty worthless the way you are now?

13. Do you feel full of energy?

14. Do you feel that your situation is hopeless?

15. Do you think that most people are better off than you are?

Page 11: Defining, Measuring, and Manipulating Variables

Reliability

“Consistency and stability of a measuring instrument” (p65)

Is the scale free from random error?

Observed score = true score + error

“High reliability” = low error

Types of errors:

Method error (e.g. test situation, equipment error)

Trait error (e.g. fatigue, health, truthfulness)

Theoretical reliability: true score / (true score + error score)

Measured reliability: correlation coefficient, ranging from -1.0 through 0 to +1.0

.70 – 1.0 Strong; .30 – .69 Moderate; .00 – .29 Weak

Not all-or-none; a measure is more or less reliable
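
The relationship between the theoretical ratio and a measured correlation can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides: the scores are simulated, the ratio is computed on score variances (an assumption about how the ratio is meant), and the names `pearson_r` and `strength_label` are hypothetical helpers.

```python
import random
from statistics import pvariance


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def strength_label(r):
    """Label a coefficient with the strong/moderate/weak cutoffs listed above."""
    size = abs(r)
    if size >= 0.70:
        return "strong"
    if size >= 0.30:
        return "moderate"
    return "weak"


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Simulated (not real) data: observed score = true score + random error.
true_scores = [rng.gauss(100, 15) for _ in range(2000)]
errors_1 = [rng.gauss(0, 5) for _ in true_scores]
errors_2 = [rng.gauss(0, 5) for _ in true_scores]
test_1 = [t + e for t, e in zip(true_scores, errors_1)]
test_2 = [t + e for t, e in zip(true_scores, errors_2)]

# Theoretical reliability, taken here as a ratio of variances (assumption).
theoretical = pvariance(true_scores) / (
    pvariance(true_scores) + pvariance(errors_1)
)

# Measured reliability: correlation between two administrations.
measured = pearson_r(test_1, test_2)

print(f"theoretical = {theoretical:.2f}, measured = {measured:.2f}")
print("strength:", strength_label(measured))
```

With less random error the two values move toward 1.0; with more error they drop, which is the sense in which reliability is a matter of degree.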

Page 12: Defining, Measuring, and Manipulating Variables

Correlational design

Scatterplot: relationship between 2 quantitative variables

How 1 variable relates to or influences another variable

Individual = dot (X and Y data point)

Page 13: Defining, Measuring, and Manipulating Variables

Lexical decision task and measurement error

Press “yes” button when you see a word (“crow”)

Press “no” button when you see a non-word (“cwor”)

IV: stimulus (word/non-word)

DV: RT (time to press button)

Types of measurement errors?

Participant responds more slowly on later trials due to fatigue

Participant responds more quickly because they saw the word just before coming to the lab

Participant responds more slowly because of sneezing during the trial

Participant performs poorly because they can’t read the words clearly on screen, because the room is too warm, or because they are thinking about other things…

Something affects behavior other than the variable you are studying

Page 14: Defining, Measuring, and Manipulating Variables

Types of reliability

How can you measure reliability?

Test-retest reliability: Compare same test on 2 occasions

Alternative forms reliability: Compare equivalent or similar tests

Split-half reliability: Compare performance on 2 halves of a test

Inter-rater reliability: Consistency/agreement between 2 judges

(number of agreements / number of possible agreements) x 100

What types of measures would use this?
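
The percent-agreement calculation for inter-rater reliability can be sketched as follows. The function name `percent_agreement` and the two judges' codes are hypothetical, chosen only to illustrate the arithmetic.

```python
def percent_agreement(rater_a, rater_b):
    """Number of agreements / number of possible agreements x 100."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same, non-empty set of observations")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * agreements / len(rater_a)


# Hypothetical codes from two judges watching the same five behaviors.
judge_1 = ["aggressive", "neutral", "aggressive", "neutral", "neutral"]
judge_2 = ["aggressive", "neutral", "neutral", "neutral", "neutral"]

# The judges agree on 4 of the 5 observations.
agreement = percent_agreement(judge_1, judge_2)
print(f"Inter-rater agreement: {agreement:.0f}%")
```

This kind of index suits measures where judges assign observations to categories, such as behavioral coding.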

Page 15: Defining, Measuring, and Manipulating Variables

Kazdin (1990): Automatic thoughts questionnaire

“An examination of the internal consistency of the ATQ yielded a coefficient alpha of .96… These statistics suggest a high level of internal consistency.”

Reliability statistic: Cronbach’s alpha

Based on the average correlation among all items of the scale

.70 – 1.0 Strong; .30 - .69 Moderate; .00 - .29 Weak

“Individual item-total score correlations, presented in Table 1, were in the moderate to high range (r’s = .39 to .81). The mean item-total correlation… was .69.”

Reliability measurements: Inter-item correlation matrix

All correlations should be positive
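
Coefficient alpha can be computed from a table of item responses. The sketch below uses the standard variance-based formula, alpha = k/(k-1) x (1 - sum of item variances / variance of total scores); the response data are made up for illustration and are not taken from Kazdin (1990), and `cronbach_alpha` is a hypothetical helper name.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])  # number of items on the scale
    if k < 2:
        raise ValueError("alpha needs at least two items")
    items = list(zip(*responses))  # regroup the scores item by item
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Made-up responses: 5 respondents x 4 items, each rated 1-5.
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 4, 5, 4],
    [5, 4, 5, 5],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Because the items in this made-up data set rise and fall together across respondents, alpha comes out high; items that correlate weakly or negatively with the rest would pull it down.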

Page 16: Defining, Measuring, and Manipulating Variables

Validity

Does the measure provide info on what we really want to measure?

Multiple types of validity

Content validity

Criterion validity

Construct validity

Validity is not all-or-none, but on a scale

Can be high in 1 type of validity and low on others

Later… (ch8)

Internal validity: eliminated extraneous variables

External validity: findings will generalize to other contexts

Page 17: Defining, Measuring, and Manipulating Variables

Content validity

Does test have representative samples of behavior?

Does content of test reflect what we want to measure?

Are all aspects of content represented fairly?

e.g. exams in courses: Do test items match with what you’ve learned/studied?

e.g. depression questionnaire: Does test measure all behaviors that would be of interest?

The more specific the variable, the easier it is to get good content validity

Face validity: does it appear to be valid?

Examine what test appears to measure on surface

But, does not provide any real evidence

Page 18: Defining, Measuring, and Manipulating Variables

Criterion validity

Extent to which a measure predicts behavior or ability in an area

Compare scores on measure with another criterion (area)

Concurrent validity

Test used to predict present performance

e.g. pilot or driving test

Predictive validity

Test used to predict future performance

e.g. SAT or GRE

Convergent validity

Significant (pos or neg) correlations found where expected

Discriminant (or divergent) validity

Zero correlations found between variables supposed to be unrelated

Page 19: Defining, Measuring, and Manipulating Variables

Construct validity

Degree to which test accurately measures construct

Examine if concept is being operationalized in a useful way

e.g. depression questionnaire

Is test measuring same construct in all populations that are tested (young – older adults; all cultures)?

e.g. induce depression

Participants read positive or negative statements to induce or diminish depressed mood, but does it resemble naturally occurring depression? Does method have construct validity?

Page 20: Defining, Measuring, and Manipulating Variables

Kazdin (1990): ATQ

“Criterion validity: Depressed versus nondepressed children” … A one-way ANOVA of total ATQ scores indicated that depressed children were significantly higher in negative thoughts (M = 82.8) than were nondepressed children (M = 52.9), F(1, 136) = 47.02, p < .001.

Overall ATQ score is predicting which group children belong to.

Another section examines which particular items (or statements on scale) distinguish groups

“Convergent validity. … As shown in Table 2 performance on ATQ correlated significantly with other measures of cognitive processes related to depression. Children who indicated more negative thoughts showed lower self-esteem, greater hopelessness, and more external attribution of control. The correlations … support convergent validity of the ATQ.”

Page 21: Defining, Measuring, and Manipulating Variables

Kazdin (1990): ATQ

“Discriminant validity. … As can be seen in Table 2, the ATQ did not correlate significantly with severity of impairment or social competence. These findings would seem to support the discriminant validity of the ATQ. However the absence of … correlations might have been due to the different raters (children vs parents).”

“These results suggest that the ATQ tended to correlate more highly with other measures of cognitive processes and with depression than with measures of prosocial behavior and positive affective experience.”

Page 22: Defining, Measuring, and Manipulating Variables

Reliability and validity

What is the relationship between reliability and validity?

Can measure be valid w/o being reliable?

No

Can measure be reliable w/o being valid?

Yes

Can give same score each time but not give any useful information!

e.g. Measure height as estimate of intelligence
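
The height example can be made concrete with simulated data. This is a toy sketch under stated assumptions: heights and intelligence scores are generated independently of each other (so height has no validity as an intelligence estimate by construction), and `pearson_r` is a hypothetical helper.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(42)  # fixed seed so the sketch is repeatable
n = 1000

# Simulated people: height and intelligence are drawn independently (assumption).
height_cm = [rng.gauss(170, 10) for _ in range(n)]
intelligence = [rng.gauss(100, 15) for _ in range(n)]

# Measure height twice, with a small random measurement error each time.
measure_1 = [h + rng.gauss(0, 0.5) for h in height_cm]
measure_2 = [h + rng.gauss(0, 0.5) for h in height_cm]

# Reliability: the two height measurements agree closely with each other.
reliability = pearson_r(measure_1, measure_2)

# Validity as an intelligence estimate: height tells us almost nothing.
validity = pearson_r(measure_1, intelligence)

print(f"test-retest reliability = {reliability:.2f}")
print(f"correlation with intelligence = {validity:.2f}")
```

The measure returns nearly the same score each time, yet carries no useful information about the construct it is supposed to estimate.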

Page 23: Defining, Measuring, and Manipulating Variables

The SAT: What type of validity?

Is the SAT useful in predicting how well students perform in college?

Does SAT-math test concepts from math courses at high school level?

Do SAT questions measure “academic strength”?

Are SAT-math and SAT-verbal positively correlated?

Page 24: Defining, Measuring, and Manipulating Variables

The SAT

Is the SAT useful in predicting how well students perform in college?

Criterion or predictive validity

Does SAT-math test concepts from math courses at high school level?

Content validity

Do SAT questions measure “academic strength”?

Construct validity

Are SAT-math and SAT-verbal positively correlated?

Convergent validity