Issues in Research

Reliability: The consistency and stability of a measure or score.

Validity: The extent to which a measure actually measures what it is intended to measure.
• The truthfulness of a measure.
• A test can be reliable and not be valid.
• Internal validity: the extent to which changes in the dependent variable can be attributed to the influence of the independent variable rather than to confounding variables.
  – Degree to which researchers can draw accurate conclusions about the effects of the independent variable.
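
To illustrate how a test can be reliable and not be valid, here is a minimal sketch. It assumes a hypothetical bathroom scale that reads about 5 kg too high; the true weight and the readings are invented for illustration and do not come from the slides.

```python
from statistics import mean, pstdev

# Hypothetical example: a person whose true weight is 70 kg steps on a
# miscalibrated scale five times. All numbers are invented.
true_weight = 70.0
readings = [75.0, 75.1, 74.9, 75.0, 75.0]

spread = pstdev(readings)            # small spread: consistent, so reliable
bias = mean(readings) - true_weight  # large bias: not measuring the true weight, so not valid

print(f"spread of readings: {spread:.2f} kg")
print(f"average error:      {bias:.2f} kg")
```

The readings agree closely with one another (high reliability), yet every one of them misses the quantity the scale is meant to measure (low validity).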

Issues in Research

• Construct validity: The extent to which a measure measures the conceptual variable it is designed to measure.

• Convergent validity: the extent to which scores on a measure correlate with scores on a different measure of the same construct.

• Discriminant validity: the extent to which scores on a measure do NOT correlate with scores on a measure of a different construct.

Developmental Research Methods (Schmidt & Teti, 2006)

Parameters of Developmental Change:
• Age: chronological age
• Cohort: group of individuals born in the same time period and experiencing the same events
• Time of Measurement: calendar time or period

Interindividual differences: differences between individuals
Intraindividual change: differences within an individual over time

Cross-Sectional Designs
• At least two samples of different ages and cohorts measured at the same time
  – Compare memory strategies of 4- and 6-year-olds.
• Age group and interindividual differences

Problems/Issues:
• Does NOT assess age changes, the stability of a characteristic over time, or intraindividual change
• Compares group means (loses individual information)
• Age by Cohort confound: results may be affected by historical or cultural differences between cohorts.
• Limits external validity (generalizability)
  – Historical or cultural differences between cohorts

Longitudinal Designs

• Individuals of a given age and cohort are followed over time.
  – Compare how memory strategies change between 4 and 6 years of age.
• Intraindividual change and age changes
• Developmental sequences and co-occurring social and environmental change
• Early-later relations
• Shape of developmental function

Longitudinal Designs

Issues/Problems:
• Expensive, time consuming
• Sampling biases: those who participate in the study may be different from those who don’t participate
• Populations change over time (representative at time 1 but not at time 2)
• Attrition (dropout, moves, death): those who drop out typically differ from those who stay.
  – Siegler and Botwinick (1979) study
• Testing and practice effects: participants become familiar with the measures and do better over time

Longitudinal Designs

Issues/Problems:
• Instrumentation: different tests at different ages
  – Validity: are you measuring the same construct?
• Same tests may become dated or obsolete
• Changes in study personnel
• Changes in how researchers define and measure the IV and DV
• Theories and hypotheses may change as new findings emerge from other sources.
• Cohort Effect: unique experience of the cohort
• Age by time of measurement confound
  – Difficult to separate

Longitudinal Designs

Regression to the Mean:
• Tendency for scores to regress (move towards) the mean on a subsequent test.
• Less error variance over time
• More common when measuring extreme scores

If participants are selected because they have extreme scores on the pretest (e.g., selecting a school with very poor reading ability), measurement error (tired, bad day, etc.) may have contributed to such low scores and slightly deflated them at the pretest.

Measurement error causes extreme scores to be biased in the extreme direction (away from the mean).

When participants are tested a second time, the same factors that deflated their scores are unlikely to recur, so their scores will increase a bit and make it look like the program has an effect.
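
The regression-to-the-mean argument can be checked with a small simulation. This is only a sketch under stated assumptions: each observed score is modeled as a stable true ability plus independent random measurement error, no program is applied between the two tests, and every number (means, standard deviations, sample size, the 10% cutoff) is invented for illustration.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Assumed model: observed score = stable true ability + random measurement error.
N = 10_000
true_ability = [rng.gauss(100, 10) for _ in range(N)]
pretest = [t + rng.gauss(0, 10) for t in true_ability]
posttest = [t + rng.gauss(0, 10) for t in true_ability]  # no treatment in between

# Select participants because of extreme (lowest 10%) pretest scores.
cutoff = sorted(pretest)[N // 10]
selected = [i for i in range(N) if pretest[i] <= cutoff]

pre_mean = mean(pretest[i] for i in selected)
post_mean = mean(posttest[i] for i in selected)
overall_mean = mean(posttest)

print(f"selected group, pretest mean:  {pre_mean:.1f}")
print(f"selected group, posttest mean: {post_mean:.1f}")
print(f"whole sample, posttest mean:   {overall_mean:.1f}")
```

Under these assumptions the selected group scores higher on the second test even though nothing was done to them, which is the pattern that can be mistaken for a program effect.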

Time-Lag Designs (Miller 1998)

• Study individuals of the same age at different points in time
  – Study 6-year-olds in 2006, 2008, and 2010
• Does not give information on age changes or differences
  – Validity: are you measuring the same construct?
• Provides information on factors that may confound age comparisons in other designs.
  – Generational factors (cohort effects) and time of measurement confounds.

Age at each time of measurement, by year of birth (cohort):

                          Time of Measurement
Year of Birth (Cohort)    2006    2008    2010    2012
1998                         8      10      12      14
2000                         6       8      10      12
2002                         4       6       8      10
2004                         2       4       6       8

Labels in the original figure: Longitudinal, Time-Lag, Cross-Sectional
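
The grid above can be rebuilt from the definitions of the three parameters, since age equals time of measurement minus year of birth. The sketch below is illustrative only: the choice of the 2000 cohort, the 2008 measurement, and age 6 as examples is an assumption, not something the figure specifies.

```python
cohorts = [1998, 2000, 2002, 2004]   # year of birth
times = [2006, 2008, 2010, 2012]     # time of measurement

# Age is fully determined by the other two parameters.
age = {(c, t): t - c for c in cohorts for t in times}

# Longitudinal: one cohort followed across times (a row of the grid).
longitudinal = [age[(2000, t)] for t in times]

# Cross-sectional: several cohorts measured at one time (a column of the grid).
cross_sectional = [age[(c, 2008)] for c in cohorts]

# Time-lag: the same age studied at different times (a diagonal of the grid).
time_lag = [(c, t) for c in cohorts for t in times if age[(c, t)] == 6]

print(longitudinal)
print(cross_sectional)
print(time_lag)
```

Because any two parameters fix the third, a single simple design always leaves two of them confounded, which is what the sequential designs try to address.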

Complex Designs

Sequential Design: combination of cross-sectional, longitudinal, and/or time-lag designs.
• Allows researcher to separate effects of age, cohort, and time of measurement
• Cohort-Sequential Design: follows different cohorts over time
  – Two overlapping longitudinal studies.
  – E.g., 6-year-olds followed from 2005-2015, and 6-year-olds followed from 2010-2020
• Age not confounded with cohort (because different times of measurement)
• Not limited to one cohort
• Cross-sectional, longitudinal, and time-lag dimensions

Complex Designs

Time-Sequential Design: Two or more cross-sectional studies conducted at different times of measurement.
  – 6-, 8-, and 10-year-olds compared in 2005, 2010, and 2015.

• Age and time of measurement are not confounded

• Complex designs are very costly and time consuming

Quasi-Experimental Designs (Leary, 2004)

• Used when researchers cannot manipulate the independent variable; instead it is a grouping variable (gender, age, disability), and equivalence between the groups cannot be ensured

• Researchers cannot randomly assign participants to groups, thus they lack control over extraneous variables

• Quasi-independent variable: not a true independent variable; it usually occurs naturally or cannot be manipulated.
• Researchers still look for an effect of the quasi-independent variable.

• Quasi-experimental designs usually have lower internal validity than true experiments.

Pretest-Posttest designs
• Test participants before and after the quasi-independent variable

One group: measure participants before and after the quasi-independent variable. There is only one group of participants (those who experienced the quasi-independent variable).
• Test reading before children at School A start the reading program and then test their reading after they finish the reading program.
• O1 X O2 (O = observation, X = quasi-independent variable)

Threats to internal validity:
• Maturation: students may have matured over the course of the reading program. They may have become better at reading simply because of the passage of time and not because of the program.

• History Effects: something other than the independent variable may have occurred between the pretest and posttest.

• Pretest sensitization: taking the pretest may change the participants' reaction to the posttest.

• Regression to the mean: tendency for extreme scores on the pretest to regress (move towards) the mean on a subsequent test (posttest).

Nonequivalent Control Group Design:
• We cannot randomly assign participants to control and study groups, so we select a control group that is similar to the group that gets the quasi-independent variable.

Posttest-only: measure both groups after one group has received the treatment.
• Measure reading in School A and School B after School A has participated in the reading program.

Quasi-experimental group       X     O
Nonequivalent control group    --    O

• Selection bias: we do not know whether the two groups were similar before the intervention

Pretest-Posttest design: Test both groups before one group gets the intervention, then test both groups again after one group gets the intervention (quasi-independent variable)

Quasi-experimental group       O1    X     O2
Nonequivalent control group    O1    --    O2

• Allows researchers to see if the two groups scored similarly on the dependent variable before the introduction of the treatment.

• To determine if the quasi-independent variable had an effect you want scores to change between pretest and posttest ONLY for the Quasi-experimental group and NOT for the Nonequivalent control group
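
The comparison described for the nonequivalent control group pretest-posttest design can be sketched as a difference in gains. Everything here is hypothetical: the school names follow the reading-program example, but the scores and the helper function `gain` are invented for illustration.

```python
from statistics import mean

# Invented reading scores; none of these numbers come from the slides.
school_a_pre, school_a_post = [52, 48, 55, 50], [61, 58, 66, 59]  # received the program (X)
school_b_pre, school_b_post = [51, 49, 54, 50], [53, 50, 56, 51]  # nonequivalent control (--)

def gain(pre, post):
    """Change in the group mean from pretest (O1) to posttest (O2)."""
    return mean(post) - mean(pre)

# Step 1: were the groups similar before the treatment?
baseline_gap = mean(school_a_pre) - mean(school_b_pre)

# Step 2: did scores change only (or mostly) in the quasi-experimental group?
gain_a = gain(school_a_pre, school_a_post)
gain_b = gain(school_b_pre, school_b_post)
difference_in_gains = gain_a - gain_b

print(baseline_gap, gain_a, gain_b, difference_in_gains)
```

A small baseline gap combined with a gain that appears mainly in School A is the pattern the design looks for, although selection bias and local history effects can still not be ruled out from these numbers alone.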

Time Series Designs
• Measure the dependent variable many times before and after the quasi-independent variable is introduced

Simple interrupted time series design
• Researchers make a series of observations of the dependent variable before and after the treatment is introduced

O1 O2 O3 O4 X O5 O6 O7 O8

• Evidence for a treatment effect occurs when there are abrupt changes in the time-series data at the time the treatment was implemented
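
As a rough sketch of what an "abrupt change at the time of treatment" looks like in data, the example below uses eight invented observations (O1 to O8) with the treatment X placed between O4 and O5. The values are hypothetical, and simply locating the largest jump is an illustrative shortcut, not a formal time-series analysis.

```python
from statistics import mean

observations = [30, 31, 29, 30, 15, 14, 16, 15]  # O1..O8, invented values
treatment_index = 4                               # X falls between O4 and O5

# Change from each observation to the next.
changes = [later - earlier for earlier, later in zip(observations, observations[1:])]

# Position (counted in observations) after which the largest change occurs.
largest_jump_at = max(range(len(changes)), key=lambda i: abs(changes[i])) + 1

pre_mean = mean(observations[:treatment_index])
post_mean = mean(observations[treatment_index:])

print(changes)
print(largest_jump_at, pre_mean, post_mean)
```

Here the series is flat before and after the treatment and the only large change coincides with X, which is the pattern taken as evidence for a treatment effect.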

[Figure: Percentage of cavities after the introduction of fluoride into toothpaste in 1970. Y-axis: 0 to 30; x-axis: 1970 to 1971.]

[Figure: Percentage of cavities after the introduction of fluoride into toothpaste in 1970. Y-axis: 0 to 35; x-axis: 1967 to 1973.]

[Figure: Percentage of cavities after the introduction of fluoride into toothpaste in 1970. Y-axis: 0 to 35; x-axis: 1967 to 1973.]

• This design helps to distinguish changes due to maturation from changes due to the quasi-independent variable

• Contemporary History: Observed effect could still be due to another event that occurred at the same time as the quasi-independent variable

• Perhaps the electric toothbrush was introduced in 1970, or there was a major TV ad campaign that promoted brushing teeth.

Interrupted time series with a reversal
• Researchers measure the DV before and after the treatment is introduced and then again after the treatment is removed

O1 O2 O3 O4 X O5 O6 O7 O8 -X O9 O10 O11

• We can see what happens to the DV after the quasi-independent variable is introduced and then again after it is removed.

• If the quasi-independent variable were really having an effect, we would expect performance to return to its earlier level after it is removed

[Figure: Percentage of cavities after the introduction of fluoride into toothpaste in 1970. Y-axis: 0 to 35; x-axis: 1968 to 1973.]

Limitations:
• Researchers may not have the ability to remove the quasi-independent variable
  – e.g., remove fluoride from toothpaste, remove a seatbelt law
• Some effects of the quasi-independent variable may remain even after it is removed
  – If you did a time series study before and after the introduction of a reading program and then removed the program, reading may not decrease; the children may not regress because they did learn to read.
• Removal of the quasi-independent variable may produce effects that are not due to the quasi-independent variable

Control Group Interrupted time series
• Measure more than one group on several occasions, but only one group receives the quasi-independent variable.

O1 O2 O3 O4 X  O5 O6 O7 O8
O1 O2 O3 O4 -- O5 O6 O7 O8

• Helps to rule out history effects, and we can be more certain that a change was due to X rather than an outside influence.

• Could still have a local history effect.

Program Evaluation

• Used to assess effectiveness of interventions (or programs) and provide feedback to the administrators

• Assess needs, process, outcome, and efficiency of social services.

Descriptive Research (Leary, 2004)

• Describe patterns of behavior, thoughts, and emotions among a group of individuals.

• Provides information about characteristics of the sample rather than testing hypotheses.

1) Survey: most common type of descriptive research.

• Select a sample of the population and ask predetermined questions

• Surveys are usually questionnaires or interviews

2) Demographic Research:

• Describes patterns of life events and experiences such as birth, marriage, employment, etc.

3) Epidemiological Research:
• Studies the occurrence of disease in groups of people
• Psychologists may study the prevalence of psychological disorders.

Correlational Research (Leary, 2004)

• Examine whether variables are related to one another (whether they vary together).

Correlation coefficient: statistic indicating how well two variables are related to one another (how well they vary together) in a linear fashion.

• Must obtain a score on each variable for each participant.

• Pearson correlation coefficient (r): most common. Values range from -1.00 to +1.00

• The direction of the relationship is indicated by the sign of the correlation coefficient.

• Positive correlation: indicates a direct, linear, positive relationship (as one variable increases the other variable also increases).

• Negative correlation: indicates an inverse, linear relationship (as one variable increases the other variable decreases)

• Magnitude of the correlation: the numerical value (ignoring the sign), which expresses the strength of the relation
  – A correlation of .33 indicates that the variables are not as strongly related as variables with a correlation of .65

• The stronger the correlation, the more tightly the data cluster around the line of best fit
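
A minimal sketch of how the Pearson correlation coefficient (r) can be computed from paired scores, one pair per participant. The function name `pearson_r` and the data (hours studied and test scores) are hypothetical and chosen only for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their separate variation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Invented data: each participant contributes one score on each variable.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r_positive = pearson_r(hours, score)                 # variables increase together
r_negative = pearson_r(hours, [-s for s in score])   # same magnitude, opposite sign
r_perfect = pearson_r(hours, [2 * h + 1 for h in hours])  # points fall exactly on a line

print(round(r_positive, 2), round(r_negative, 2), round(r_perfect, 2))
```

The sign carries the direction of the relationship and the absolute value carries its strength, so the first two results differ only in sign.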

• Two variables may be related in a curvilinear fashion.

• The Pearson correlation may be at or near 0 even though the variables are still related in a non-linear way.
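
The point about curvilinear relationships can be illustrated with a small example in which one variable is completely determined by the other and the linear correlation is still zero. The data are invented (a symmetric U-shape, y = x squared), and `pearson_r` is an illustrative helper written out here so the block stands alone.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient for paired scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Invented U-shaped data: y depends entirely on x, but not linearly.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r_curvilinear = pearson_r(x, y)
print(r_curvilinear)
```

A scatterplot is therefore worth checking before concluding from a near-zero r that two variables are unrelated.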

Basic Issues in Experimental Research (Leary, 2004)

• Experimental designs allow researchers to make cause and effect conclusions.

Three characteristics of a true experiment:
• Researcher must vary at least one independent variable and assess its effects on a dependent variable.

• Researcher must assign participants to experimental conditions in a way that ensures initial equivalence.

• Researcher must control extraneous variables that may influence the participants’ behavior.

Assigning participants to conditions:
• Want to ensure that the participants are equivalent before they are assigned to conditions, so that effects are due to the manipulation of the IV and not due to pre-existing participant characteristics.

Between-subjects designs:
• Simple random assignment: Each participant has an equal probability of being placed in each condition.
• Matched random assignment: Test the participants on a measure related to the dependent variable and then assign them to conditions by matching, to ensure that each condition has the same number of people who are high and low on the measure
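
The two between-subjects assignment procedures can be sketched as follows. This is one possible implementation under stated assumptions, not a prescribed procedure: the function names, participant IDs, matching scores, and condition labels are all hypothetical, and matching is done here by ranking participants on the pretest measure and randomizing within blocks of adjacent ranks.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

def simple_random_assignment(participants, conditions, rng):
    """Each participant has an equal probability of landing in each condition."""
    return {p: rng.choice(conditions) for p in participants}

def matched_random_assignment(scores, conditions, rng):
    """Rank participants on the matching measure, then randomly assign
    the members of each block of similar scorers to different conditions."""
    ranked = sorted(scores, key=scores.get)
    k = len(conditions)
    assignment = {}
    for start in range(0, len(ranked), k):
        block = ranked[start:start + k]
        shuffled = rng.sample(conditions, k)  # random order of conditions for this block
        for person, condition in zip(block, shuffled):
            assignment[person] = condition
    return assignment

# Invented scores on a measure related to the dependent variable.
scores = {"P1": 12, "P2": 30, "P3": 18, "P4": 25, "P5": 9, "P6": 21}
conditions = ["treatment", "control"]

simple = simple_random_assignment(list(scores), conditions, rng)
matched = matched_random_assignment(scores, conditions, rng)
print(simple)
print(matched)
```

Simple random assignment can produce unequal groups by chance in a small sample, whereas the matched version guarantees that high and low scorers are split evenly across conditions.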

Within-subjects design
• Repeated measures design: each participant completes all conditions
• No need for random assignment
• Participants may participate in the experimental and control conditions or in all the different levels of the independent variable
• More powerful than between-subjects designs
  – Because the participants serve as their own controls
• Requires fewer participants (30 people can participate in all three conditions, instead of 30 per condition for a total of 90).

• Order effects: the order in which the levels of the independent variable are received may affect the participant’s behavior

• If studying memory for words under different lighting conditions (each successive condition has more light), participants may be tired by the last condition, which may reduce performance.

• Participants may show a practice effect in that they get better at the task in subsequent conditions.

• Counterbalancing: A procedure in which the order of conditions in a repeated-measures design is arranged so that each condition occurs equally often in each position of the order.

• Carryover effects: occur when the effects of one level of the independent variable are still present at another level (condition).
  – Must ensure a drug of one dosage wears off before the next condition is started
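
Counterbalancing can be sketched by generating every possible order of the conditions and spreading participants evenly across those orders. This assumes complete counterbalancing with three hypothetical conditions (A, B, C) and twelve invented participant IDs; other schemes, such as Latin squares, are not shown.

```python
from collections import Counter
from itertools import permutations

conditions = ["A", "B", "C"]
orders = list(permutations(conditions))  # all 6 possible orders of 3 conditions

# Invented participant IDs, cycled through the orders so each order is used equally often.
participants = [f"P{i}" for i in range(1, 13)]
schedule = {p: orders[i % len(orders)] for i, p in enumerate(participants)}

# How often does each condition appear in the first, second, and third position?
position_counts = [
    Counter(order[position] for order in schedule.values())
    for position in range(len(conditions))
]

for position, counts in enumerate(position_counts, start=1):
    print(position, dict(counts))
```

With every order used equally often, order effects such as fatigue or practice are spread evenly over the conditions, although carryover effects that differ by order are not removed by this alone.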

Internal validity threats:

• Biased assignment of participants to conditions: participants in each condition differ at the beginning, so differences in the dependent variable may reflect pre-existing differences among the participants rather than differences due to the independent variable

[Figure: diagram of letters A, B, and C grouped under the labels Random Assignment and Biased Assignment.]

• Differential attrition: attrition refers to participants who do not continue in the study (drop out). Attrition can occur at a different rate in the different conditions
  – Problematic when more participants drop out of one condition than the other condition

• People who drop out may be different from those who stay (more scared of the experiment, less motivated).

• Pretest sensitization: taking a pretest may affect how participants behave in the experiment, so it is hard to determine whether effect is due to the pretest or the independent variable.

• History: history effects can affect the DV
  – When testing anxiety in participants, perhaps a participant in one group had just gone through a very anxiety-provoking situation and may already be more anxious due to factors outside the experiment.

• Maturation: Participants may change over time; it may be difficult to distinguish the effect of the independent variable from maturational changes over time.
  – More problematic in research with children.

• Miscellaneous design confounds: due to participants being treated differently in different conditions, which results in confounding.

• Experimenter expectancy effects: researchers may observe behavior in a biased way that reflects what they expect to happen.
  – Their expectations can distort the results

• Demand characteristics: participants may behave differently because of noticeable aspects of the experiment
  – They may be able to guess what the researchers are studying and act accordingly.

• Double-blind procedure: neither the participant nor the researcher knows which condition a participant is in.
  – Helps to eliminate experimenter expectancy effects and demand characteristics

• Placebo Effects: an artifact that occurs when participants' expectations about what effect an experimental manipulation is supposed to have influence the dependent variable

• If participants think they are in a drug group they may be more likely to say the drug produced an effect.

• Placebo control group: receives a pill with no drug, so participants do not know whether they are truly receiving the drug