Research Methods All Slides

Research Design in Counseling

Psychology

Fall, 2014

Tuesdays, 1:00 to 3:50; 142 HEDCO Building

Instructor: Elizabeth A. Skowron, Ph.D.,

257 HEDCO Building

541-346-0913

eskowron@uoregon.edu

office hrs: after class & by appt.

Course Overview • Scientific Methods

• Ethical research practice

• Sampling, measurement, and methods of data collection

• Research designs o Experimental

o Quasi-experimental

o Correlational

o Longitudinal

• Types of validity & plausible threats

• Culturally competent research

• Randomized clinical trials

• Implementation (taking effective interventions to scale)

Course Overview

Scheduled    Weight (in %)    Activity

All term     15               Class Preparation & Participation

Week 3       5                CITI Certification

Week 5       25               Exam I

Week 8       30               Exam II

All term     25               In-class/Homework Activities (5 @ 10 pts each)

Introductions

Three Programs

• Counseling Psychology

• Couples & Family Therapy

• Prevention Science

Introductions

Name __________________

Program ____________________

Interests ____________________

RATE YOUR Research self-efficacy (1 2 3 4) (a little – a lot)

Research skill/experience (1 2 3 4) (a little – a lot)

Ways of ‘Knowing’

• Method of tenacity o The beliefs I firmly adhere to are ‘true’

• Method of authority o If noted authorities (e.g., my father, the president, my therapist, my pastor) say it is so, then it is ‘truth’

• ‘A priori’ method o What makes sense is ‘true’

• Scientific method o What is discovered through empiricism is ‘true’

o Empiricism = the approach of collecting data and using it to develop, support, or challenge a theory

Science A dynamic view regards science as an activity

• Make discoveries

• Learn facts

• Advance knowledge

o Establish general laws & connect knowledge of separately known

events, make reliable predictions of events yet unknown

• Improve quality of life

The basic aim of science is discovery that leads to theory

Theory 1. a set of interrelated constructs, definitions, and propositions

2. presents a systematic view of phenomena by specifying relations among

variables,

3. its purpose is to explain and predict phenomena

The Research Process

(theory building & testing ~ inextricably related)

Theory Building & Theory Testing in

Research

Fundamentals of Scientific Exploration

1. Describe o What is happening? How does it occur?

o Identify and understand phenomena—special, meaningful events whose cause is in question—in order to reveal their underlying regularities

o Enables us to build models and construct theory to account for those regularities

2. Explain o Why is it happening? How are things interrelated?

o Involves revealing the nature and structure of phenomena and their operation in specific conditions

• Empirical pattern identification

• Theory testing

3. Predict o Speculate or test what will happen in the future, based on our (theoretical/empirical)

models for what happens and why it happens

4. (Influence)

The Scientific Method

• Make observations

• Ask questions about the observations

(i.e., frequency, association, causal)

• Form a hypothesis

• Design research study appropriate to test the hypothesis

• Collect data

• Analyze data

• Accept or reject the hypothesis o Accept ~ confirm theory

o Reject ~ reject or revise theory

Three Kinds of Conclusions to Draw from Research

• Frequency Claims

o Describe a particular rate or level of something

o Typically focus on 1 variable

o Variable is measured, not manipulated

• “More than 2 million U.S. teens depressed”

• “Half of Americans struggle to stay happy”

• “Almost 1 million children were abused or neglected last year”

• Associational Claims o Argue that 2+ variables are related (+/-)

o Involve at least 2 variables

o Variables are measured, not manipulated

• “Belly fat linked to dementia”

• “Laptop computer use linked to poor sperm quality”

• “Poor nutrition associated with school failure”

• Causal Claims o Argue that one variable causes another

o Study must meet 3 criteria: covariation, temporal precedence, and internal validity

• “Music lessons enhance IQ”

• “PCIT reduces child maltreatment recidivism”

• “Debt stress causes health problems”

Three Rules for Causation

In order to make the claim that one variable CAUSES another variable, the following 3 conditions must apply:

1. Covariation o Two variables are associated. As A changes, B changes

(e.g., A= public service ad on parenting, B = child abuse)

2. Temporal Precedence o Cause precedes effect. A appears and then B follows; changes in A precede

changes in B

(e.g., public service ads appear on TV, then child abuse rates drop)

3. Internal Validity o Plausible alternative explanations for the results (i.e., 3rd variable threats) are

ruled out

o There are no likely alternative explanations for the change in B; A is the only thing that changed

Instructions for Registering and Completing

CITI RCR training

• Go to the CITI website: https://www.citiprogram.org/

• New Users: Click on the “New Users Register Here” link. o From the “Participating Institutions” drop-down menu, select “University of Oregon” as your institution.

o Create your username, password and security question and answer.

o Enter your contact information.

• To complete the CITI course, you must complete all required modules and quizzes, achieving a minimum passing score of 80%. A quiz can be taken more than once to achieve this minimum score. You are not required to complete the course in one sitting. Your progress will be saved if you choose to stop the course and return at a later time.

• When you complete all required modules successfully, please print or download your completion report. A copy will be sent automatically to Research Compliance Services. Send a copy of your completion report to Dr. Skowron, at eskowron@uoregon.edu with the message topic “CITI training completed”. You can return to the CITI site at any time to obtain a copy of your completion report.

Next Week

• Research Continuum

• Variables & methods of

measurement

• Developing a research study

_________________

• Ethical conduct in research

• CITI training review

Psychology

Class 2


Fundamentals of Scientific Exploration aka “the course of scientific progress”

• Description o What is happening? How does it occur?

o Identify and understand phenomena—special, meaningful events whose cause is in question—in order to reveal their underlying regularities

o Enables us to build models and construct theory to account for those regularities

• Explanation o Why is it happening? How are things interrelated?

o Involves revealing the nature and structure of phenomena and their operation in specific conditions

• Empirical pattern identification

• Theory testing

• Prediction o Speculate or test what will happen in the future, based on our (theoretical/empirical)

models for what happens and why it happens

• (Influencing)

Theory—Data Feedback Loop

• The basic aim of science is to understand/explain natural phenomena

• These explanations are called ‘Theories’

• Instead of trying to explain each and every separate behavior of children, we seek general explanations that encompass and link together many kinds of (similar) behavior

• We formulate hypotheses based on our theory

• We collect data to test the hypotheses

• Data informs accuracy of our theory, and leads to revisions & modifications to theory

• We formulate new hypotheses

• We collect data to test the hypotheses

• AND SO ON…

The Contact-Comfort Theory

(Another example of Theory-Data Cycle)

Example: Theory—Data Loop

• Repairing ruptures in the therapeutic alliance in psychotherapy (Safran et al., 2011)

• Roughly 50% of psychotherapy cases experience an alliance rupture

• Theory of how to repair alliance ruptures was constructed

• Data collected via studies of psychotherapy process during sessions—what rupture/repair processes lead to + outcomes?

• Alliance ruptures & repairs defined & measured from client, therapist, & observer perspectives

• Rupture = disagreements about the tasks of therapy, goals of treatment, or strains in the client-therapist bond

• Pattern of rupture repairs linked with good outcomes, refined, retested…

• Common rupture-repair interventions:

• Therapist acknowledges the rupture

• Therapist explores it with the client

• Therapist clarifies misunderstandings

• Therapist takes responsibility for his/her contribution

• Therapist explores relational themes (in the client’s life) associated with the rupture

• Therapist links the therapy rupture to common patterns in the client’s life

• Therapist facilitates a new experience

Research Continuum

Basic—Translational—Applied

• Basic: Pure research that advances fundamental knowledge about the human world. Focuses on refuting or supporting theories. The source of most new scientific ideas and ways of thinking about the world. It can be descriptive or explanatory.

• Translational: Research that applies findings from basic science to practical applications that enhance human health and well-being. Applying knowledge from basic research is a major stumbling block in science, partially due to compartmentalization of work based on expertise.

• Applied: Form of research involving the practical application of science.

Developing a Research Project

• Identify a topic or area of interest

• Formulate research ‘problem’

• Specify in terms of question re: relationship between 2+

variables

• Translate question into a testable hypothesis o Is it falsifiable?

• Design study to test your hypothesis

Identifying Research Topics

• Personal interests/experience

• Read journals

• Study theory ____________________________________

• Science must operate at the level of observation, and

gather data to test hypotheses o Requires us to move from the construct level to the observational level

• e.g., ‘early deprivation’ and ‘learning problems’

o We have to define our constructs clearly enough so that observations are

possible

Operationalizing Research Topics

• Constructs: are concepts that cannot be directly observed

• Variable: is a symbol to which numbers or values are

assigned; can take on any set of values; can be dichotomous

to continuous o When operationally-defined, they are observable

• Operational definitions: assign meaning to a construct/

variable by spelling out what the investigator must do to

measure it o (1) measured: describes how the variable will be measured

o (2) experimental: spells out the details of the investigator’s manipulation of a variable

o Reinforcement schedule

o Intervention type & dosage

o No operational definition can ever reflect all of a variable…

Types of Variables 1. Independent and Dependent variables

o We are trying to explain the DV or predict the DV

o In correlational and/or experimental studies, we look for variation in the IV to predict the DV

o In experiments we manipulate the IV and look for effects on the DV

o Causal claims: IV is presumed cause of DV; IV is antecedent & DV is consequent

o Association claims: Variables may be called ‘predictor’ (IV) and ‘criterion’ (DV)

2. Active and Attribute variables o Active variables are manipulated (e.g., dose of prevention; experimentally-induced

stressor)

o Attribute variables cannot be manipulated, can only be measured

• (e.g., most human characteristics: ethnicity, age, sex…)

o Some attribute variables may also be active, depending on your design

• (e.g., anxiety…)

3. Continuous and Categorical variables o Continuous variables take on an ordered set of values (rank, interval, ratio scale)

o Categorical variables belong to a nominal scale of measurement: cases are sorted into two or more subsets (e.g., political party membership, sex, college alma mater, religion)

Methods of Measurement 1. Self-report

o Participant makes an observation or report on self

o + : easy to administer, economical, accesses private thoughts, feelings, behavior not accessible to investigators

o -- : vulnerable to distortion, presume client insight/understanding about construct being measured

2. Other-report (parents, therapist, teacher, etc.) o Respondents rate the participant on some dimension(s)

o + : easy to administer, economical

o -- : potential systematic bias (e.g., cultural competence of rater – cross-cultural child development study)

3. Behavioral observations o Measures of overt behavior by trained observers using coding system

o + : direct and objective

o -- : presumption that observed behavior is representative; costly; feasibility?

4. Neurobiological indices

5. Interviews o + : flexible, high completion rate

o -- : costly; feasibility?

6. Unobtrusive measures o Assessment conducted without participants’ awareness

o + : eliminates reactivity to measurement

o -- : expensive?; some types are unethical

Writing Research Problems and Hypotheses

1. Work with your table group to brainstorm a list of interesting

research topics (tables).

2. Work with a partner to identify two ‘topics’ of interest to you from

the list of topics. State each of these ‘topics’ as a question about

the relationship between 2 variables (2 person groups).

3. Write a definition for each of your variables.

1. Identify IVs and DVs; active vs. attribute variables

2. What methods can be used to measure each of your

variables? (e.g., self-report, observation, performance, other’s report—

teacher/parent/spouse, other)

4. Discuss in class

Goal of Ethical Research

to create new knowledge (beneficence) while preserving the dignity and welfare of participants (non-maleficence & autonomy)

Ethical scholarship

As researchers, we have responsibility to seek and share accurate information in our scholarly endeavors, in:

1. Executing a research study o Respect Ss’ rights, conduct study carefully, minimize bias in methods & measures, &

ensure both data & analyses are error-free

o Maintain raw data for 5+ years post-publication

2. Reporting our results o Accurately, honestly, note limitations…& guard against misuse of results

o “the facts are always friendly” Carl Rogers

3. In presentations & publications o Avoid duplicate/piecemeal publication

o Clearly identify multiple publications from same data set

4. Giving accurate publication credit Major contributions = authorship

• formulate research question/hypotheses, design study, conduct analyses, write manuscript

Minor contributions = footnote (e.g., editing, collecting data, coding data, clerical work)

Publication Credit In case of student thesis or dissertation, APA guidelines

state that “except under exceptional circumstances, a

student is listed as principal author on any multiple-

authored article that is substantially based on the student’s

doctoral dissertation”

Plagiarism 1. Omitting necessary citations

2. Failing to cite relevant work

3. Verbatim copying of another’s writing

FIX: Give credit where/when it is due

History of unethical treatment of

research participants

• Nazi prison camp experiments

• Nuremberg Code o Basis for first guidelines regarding ethical treatment of research participants

• Tuskegee Syphilis Study (1932-1972) o Whistle-blower ends study

• 1974: Code of Federal Regulations implemented Public Law 93-348 (rev. 1983) o establishing Institutional Review Boards (IRBs) to protect human participants in

biomedical & behavioral research

Ethical Violations • Tuskegee Syphilis Study

• Researchers lied o Told men they were being treated, but none were

o Conducted painful spinal taps to track disease progression, but told men it was a “special free treatment”

• Withheld information o Men who contracted the disease were not informed

o 1947: penicillin became the standard treatment for syphilis, but this fact was not shared with participants

• Actively interfered with men’s efforts to get treatment

• Acts prevented men from serving in armed forces and benefiting from GI bill and benefits

• 1969: PHS employee blows whistle, no action, 1972 breaks story to Associated Press

• 1972: Study ends

In 1932, U.S. Public Health Service in

cooperation with the Tuskegee Institute

began a 40-year study of 600 Black

men to understand effects of syphilis

on health over time

• 400 already infected

• 200 were not

The Belmont Report: Each Principle

Has an Application

• Respect for persons – Informed consent – Protection of vulnerable populations

• Beneficence – Cost-benefit analysis for participants – Cost-benefit for society

• Justice – How are participants selected? Do they

represent the people who will benefit from the study?

Beneficence: Cost-Benefit Balance

[Figure: 2 x 2 grid crossing risk to participants (low vs. high) with benefit. Cell labels: “Do the study,” “Don’t do the study,” and “Do the study?” (twice).]

APA Guidelines for ethical research practice

Guiding principles

1. Non-maleficence o First, do no harm

2. Beneficence o Do good & give back to the community

3. Justice o Fairness, including rewards for one’s labor

4. Autonomy o Right to voluntarily participate or decline to

o Underpins ‘informed consent’

5. Fidelity o Faithfulness, loyalty, keeping promises to maintain

confidentiality, etc.

Respect

IRB Guidelines for Ethical Treatment of

Research Participants 1. Risks and Benefits

o ID risks & work to eliminate or minimize these; protect SS from harm

o ID potential benefits to SS; clarify benefits for whom?

o Weigh the balance of risks-benefits

o Pilot all new procedures, measures

2. Informed Consent o Give SS a fair, clear, explicit summary including risks & benefits, then seek consent to

participate

o Obtain assent from children

o Consider ability to provide consent – mental competence, etc.

o Voluntariness: consent must be free of any coercion (i.e., students, institutionalized persons, client status, etc.)

o Document

3. Deception & Debriefing o Involves deliberate withholding of info or providing misinformation to SS (i.e., Cole et

al.’s Disappointment task)

o Additional responsibilities & safeguards are required with use of deception

IRB Guidelines for Ethical Treatment of

Research Participants

4. Confidentiality & Privacy o Protect any information that a SS shares during the study

o Concern for well-being may necessitate exceptions to confidentiality

• Any exceptions are clearly stated (i.e., harm to self/others)

o Anonymity = no identifiers can link you to your data

5. Treatment issues o (withholding effective treatment, deception)

o Great concern when withholding a treatment known to be effective

• Strategies: wait-list & delayed treatment groups; contrast with treatment as usual

Instructions for Registering and Completing

CITI RCR training

• Go to the CITI website: https://www.citiprogram.org/

• New Users: Click on the “Register” link under Create an Account.

o Start typing “University of Oregon” as your organization and click the option when it appears.

o Enter your contact information

o Create your username, password and security question and answer.

o The next step involves optional collection of demographic information. Answer as you prefer and continue to the next step.

o Answer “No” regarding professional continuing education requirements (Not applicable to RCR users)

o Complete required questions in the next step, regarding institutional e-mail address, gender, etc.

o Skip the Human Subjects Research question and move on to the Responsible Conduct of Research (RCR) training question

o Select the RCR course most appropriate to your research discipline (i.e., social and behavioral sciences) and your status at the University (undergraduate student, graduate student, or postdoctoral researcher). If you have any questions regarding which course you should take, please contact me.

o The remaining courses do not apply to the RCR training. Click the “Complete Registration” button at the end.

• To complete the CITI course, you must complete all required modules and quizzes, achieving a minimum passing score of 80%. A quiz can be taken more than once to achieve this minimum score. You are not required to complete the course in one sitting. Your progress will be saved if you choose to stop the course and return at a later time.

• When you complete all required modules successfully, please print or download your completion report. A copy will be sent automatically to Research Compliance Services. Send a copy of your completion report to Dr. Skowron, at eskowron@uoregon.edu with the message topic “CITI training completed”. You can return to the CITI site at any time to obtain a copy of your completion report.

In-Class Activity 1

Ethical Concerns in Human Subjects Research

1. A prevention science researcher applies to an IRB, proposing to

observe children ages 2 to 10 eating their meals and playing in the

local McDonald’s play area. Because the area is public, the

researcher does not plan to ask for informed consent from the

children’s parents.

• What ethical concerns exist for this study?

• What questions might an IRB ask?

2. A psychologist plans to hand out surveys in her 300-level

undergraduate class. The survey asks about student study habits and

substance use. The psychologist does not ask the students to put

their names on the survey; instead, students will put completed

surveys into a large box at the back of the room. Because of the low

risk involved in participation and the anonymous nature of the survey,

the researcher requests to be exempted from formal informed consent

procedures.

• What ethical concerns exist for this study?

• What questions might an IRB ask?

3. Discuss in class & submit for grading

Technical function of good research

design = To control variance (attend to the 4 validities)

MAXMINCON (Kerlinger, 1973, 1986)

Maximize systematic variance Maximize variance of the variables in your substantive research hypothesis

Experimental variable: make conditions as different as possible

Associational variable: seek wide range of scores/levels as possible

Minimize error variance Reduce the errors in measurement of your constructs and increase the reliability

of your measures

Control extraneous variance Control variance of extraneous or unwanted variables that may affect or relate to your variables of interest

3 ways to control these

MAX ‘Maximize systematic variance’

[Figure: dependent variable = emotion dysregulation; axis labels: Emotion Dysregulation, Violence exposure.]

MIN ‘Minimize error variance’

Give the systematic variance (the stuff you’re interested in)

a chance to show itself

1. Sources of error variance (errors in measurement): Guessing, fatigue over time, momentary inattention, variation in responses from

trial to trial

Solutions:

2. (Un)reliability of measures: Consistency in measurement across items, raters, time, etc.

MIN • Reliability of your measures will constrain the strength of

association you can observe between the variables of

interest (e.g., Ghiselli et al., 1981)

• r_yy = reliability of the y scores

• r_ox,oy = observed correlation between x and y

• r_tx,ty = true correlation between x and y

Correction for attenuation

MacCoun, 2006
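The three quantities defined above are usually tied together by Spearman's correction for attenuation. The sketch below is an illustration only, not taken from the slides: it assumes the standard form r_tx,ty = r_ox,oy / sqrt(r_xx * r_yy), where r_xx (reliability of the x scores) is a symbol the slide does not define. The function names and the example numbers are hypothetical.

```python
import math


def attenuated_r(true_r: float, rxx: float, ryy: float) -> float:
    """Observed correlation expected when both measures are imperfectly reliable."""
    return true_r * math.sqrt(rxx * ryy)


def disattenuated_r(observed_r: float, rxx: float, ryy: float) -> float:
    """Spearman's correction: estimate the true correlation from the observed one."""
    if not (0 < rxx <= 1 and 0 < ryy <= 1):
        raise ValueError("reliabilities must be in (0, 1]")
    return observed_r / math.sqrt(rxx * ryy)


# Hypothetical example: a true correlation of .50 measured with
# reliabilities of .70 (x) and .60 (y).
observed = attenuated_r(0.50, rxx=0.70, ryy=0.60)
recovered = disattenuated_r(observed, rxx=0.70, ryy=0.60)
print(round(observed, 3), round(recovered, 3))
```

With these made-up reliabilities, a true correlation of .50 would be observed as roughly .32, which is the sense in which unreliable measures cap the association a study can show.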

MIN If our dependent variable measure is unreliable, the observed correlation will underestimate the true x – y relationship; the lower the reliability, the larger the underestimate.

CON “Control Extraneous Variance”

Identify plausible ‘3rd’ variables and control their influence on your study variables of interest in 1 of 3 ways

Principle 1: To eliminate the effect of a possible influential ‘3rd’ variable on a dependent variable, choose participants so that they are as homogeneous as possible on that ‘3rd’ variable

Principle 2: Whenever possible, randomly assign participants to experimental groups and conditions

Principle 3: control the effects of a ‘3rd’ variable by building it into the research design as an attribute variable that is measured and then statistically controlled

Principle 3a: Match participants across conditions or groups by splitting a variable into 2 or more parts, then randomize within each level
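Principles 2 and 3a can be sketched in a few lines: group participants by level of the ‘3rd’ variable, then randomize within each level. This is a minimal illustration under stated assumptions, not a procedure from the course: the function name, the participant IDs, and the two age bands are all hypothetical, and the fixed seed is there only to make the sketch reproducible.

```python
import random
from collections import defaultdict


def blocked_assignment(participants, stratum_of, conditions, seed=0):
    """Randomly assign participants to conditions within each stratum.

    participants: list of participant IDs
    stratum_of:   dict mapping ID -> level of the '3rd' variable (e.g., age band)
    conditions:   list of condition labels
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    strata = defaultdict(list)
    for pid in participants:
        strata[stratum_of[pid]].append(pid)

    assignment = {}
    for level in sorted(strata):
        members = strata[level]
        rng.shuffle(members)  # random order within the stratum
        for i, pid in enumerate(members):
            # Deal shuffled members out to conditions in turn, so each
            # condition gets a near-equal share of every stratum.
            assignment[pid] = conditions[i % len(conditions)]
    return assignment


# Hypothetical data: 8 children split into two age bands.
ids = [f"P{i}" for i in range(1, 9)]
age_band = {pid: ("younger" if i < 4 else "older") for i, pid in enumerate(ids)}
groups = blocked_assignment(ids, age_band, ["treatment", "control"])
```

Because assignment happens inside each age band, the treatment and control groups end up balanced on age, so age cannot masquerade as a treatment effect.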

CON • Extraneous ‘3rd’ variables to control for…?

Dependent variable:

Emotion dysregulation

Violence exposure

Child Age

Caregiving

Next week (Morling Ch. 3, 14; CITI)

____________________________

3 Claims

Research Designs

4 Validities

Psychology

Class 3


Week 3

________________

3 Claims

4 Validities

________________________________

Next Week: Research Designs

Validity Issues in Research Design

To draw valid conclusions about research

questions, we must design studies to

minimize the potential for alternative

explanations of the results

Three Claims

• Frequency claims

• Association claims (types of associations)

• Causal claims

Practice Identifying Claims

a. Worry may make women’s brains work overtime.

b. High “normal” blood sugar may still harm brain.

c. Want a higher GPA? Go to a private college.

d. Those with ADHD do one month’s less work a year.

e. When moms criticize, dads back off baby care.

f. Report: 16% of teens have considered suicide.

g. MMR shot does not cause autism, large study says.

h. Breastfeeding may boost children’s IQ.

i. Breastfeeding rates hit new high in United States.

j. Smiling may lower your heart rate.

k. OMG! Texting and IM-ing doesn’t affect spelling!

l. Facebook users get worse grades in college.

m. Mother’s heartburn means a hairy newborn.

Practice Identifying Claims

a. Indicate if the claim is frequency, association, or

cause.

b. For each claim, identify the variable(s).

c. For each variable, is it manipulated or measured?

d. State each variable at the conceptual level.

e. State each variable in terms of its operational

definition: How might it have been operationalized?

Interrogating the Three Claims

Using the Four Big Validities

Four (Big) Validities

Statistical Conclusion Validity: Are the variables actually statistically related?

Is the statistical test able to detect small associations/small differences (i.e., small

effects)?

Internal Validity (most relevant in studies that test for causal relations)

The extent to which observed changes in a DV are attributable to/caused by an IV?

What 3 conditions need to be met to establish a causal relationship? (REVIEW)

Construct Validity Do the measured variables reflect the actual constructs of interest?

Are all important aspects of the constructs represented in the study variables?

External Validity Are the study results applicable (i.e., generalizable) to other groups, settings, time-

frames?

Threats to Validity

Threats to Statistical Conclusion Validity

• Low statistical power

• Violated assumptions of your statistical tests

• “Fishing” and error rate problems

• Unreliability of measures/treatment implementation (MIN)

• Restriction of range (MAX)

• Extraneous variance (3rd variable threats) (CON)

Threats to Statistical Conclusion Validity • Low statistical power

o Statistical power = probability of finding a relationship or effect when it really exists (i.e., power to find a true effect)

o Type II Error = risk of failing to find a relationship (significant effect) that really exists

• Steps to increase statistical power 1. Use a larger sample size

2. Increase the effect size (MAX your systematic variance)

3. Decrease noise (MIN your error variance)
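The three steps above can be made concrete with a rough power calculation. The sketch below assumes a two-group comparison of means, a two-sided test, and a normal (z) approximation; none of this is specified in the slides, and the function name and example values are hypothetical. The effect size d is the standardized mean difference (mean difference divided by the standard deviation).

```python
from statistics import NormalDist


def two_group_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group comparison of means.

    Uses the normal (z) approximation; d is the standardized mean
    difference (Cohen's d), assumed known for planning purposes.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = d * (n_per_group / 2) ** 0.5  # expected z statistic under the effect
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Power rises with sample size for a fixed, hypothetical effect of d = 0.5.
for n in (20, 64, 100):
    print(n, round(two_group_power(0.5, n), 2))
```

A larger sample raises power directly (step 1). Steps 2 and 3 both work through d: a bigger mean difference or a smaller standard deviation makes d larger. The Type II error rate is 1 minus the power.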

• “Fishing” and error rate problems

• Conducting lots of analyses on a data set and treating each as independent

• In stats analyses, we use the p < .05 level of significance

• If no relationship exists, a significant result is expected to occur by chance in only 5 out of every 100 times we run the analysis

• Odds are 5 out of 100 that we will see a relationship (i.e., significant effect) even if none exists

o Query: What are the chances of finding a significant effect if you:

• conduct 10 separate tests with your data? p = ______

• 20 separate tests with your data? p = ______

• Solution: o Adjust the error rate (i.e., p-value, significance level) to reflect the number of analyses

you plan to conduct

o ‘Experiment-wise’ p = .05 / N of tests

N of tests = 4, experiment-wise p = .0125

N of tests = 6, experiment-wise p = .008
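The arithmetic behind the query and the solution can be sketched as follows. This is a minimal illustration that assumes the tests are independent and that no true effect exists. The division of .05 by the number of tests is commonly called the Bonferroni adjustment, a name the slides do not use, and the function names are hypothetical.

```python
def familywise_error(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level that keeps the overall rate near alpha."""
    return alpha / n_tests


for n in (1, 4, 6):
    print(n, round(familywise_error(n), 3), round(bonferroni_alpha(n), 4))
```

Calling `familywise_error(10)` and `familywise_error(20)` gives the answers to the two query items.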

• Unreliability of measures/treatment implementation

If the measurement of our variables is unreliable, the observed correlation will underestimate the true x – y relationship.

MacCoun, 2006

• Restriction of range (MAX)

Extraneous variance

3rd variable threats must be identified (CON)

• Control their influence on your study via…

Principle 1: To eliminate the effect of a possible influential ‘3rd’ variable on a dependent variable, choose participants so that they are as homogeneous as possible on that ‘3rd’ variable

Principle 2: Whenever possible, randomly assign participants to experimental groups and conditions

Principle 3: control the effects of a ‘3rd’ variable by building it into the research design as an attribute variable that is measured and then statistically controlled

Principle 3a: Match participants across conditions or groups by splitting a variable into 2 or more parts, then randomize within each level

Threats to Internal Validity

Compromise our confidence in assertions that a relationship/effect exists between the independent and dependent variables.

• History

• Maturation

• Statistical regression (law of initial values)

• Selection

• (Differential) Attrition

• Testing

• Instrumentation

• Compensatory equalization of treatments

• Resentful demoralization

• Treatment diffusion

History: Did some unanticipated event occur while

the experiment was in progress and did these

events affect the dependent variable?

A threat for the one-group design, but not for two-group designs

In the one-group pre-test post-test design, the effect of the treatment is inferred from the difference between the pre- and post-test scores. This difference may be due to the treatment or to history.

History:

Not a threat for two-group designs (i.e., treatment/experimental group vs.

comparison/control group).

If the history threat occurs for both groups, the difference between the

two groups will not be due to the history event.

Maturation: were changes in the dependent variable due to normal developmental processes operating within the participant as a function of time?

Is a threat for the one-group design.

Is not a threat for the two-group design, assuming that participants in

both groups change (‘mature’) at the same rate.

Examples: Threats to Internal Validity

History: In a short intervention designed to

investigate the effect of computer-based self-control

instruction, participants missed some instruction

because of a power failure at the school.

Maturation: the performance of 1st graders in a

learning experiment begins decreasing after 45

minutes due to fatigue

Statistical regression: An effect that is the result

of a tendency for participants selected on the

basis of extreme scores to regress towards the

mean on subsequent tests.

When measurement of the dependent variable is not perfectly reliable,

there is a tendency for extreme scores to regress or move toward the

mean over time.

The amount of regression to the mean is inversely related to the

reliability of the test.
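The inverse relation between reliability and regression to the mean can be shown with a one-line prediction rule. The sketch below assumes a simple linear model in which test and retest have the same mean and variance and correlate at the test's reliability. That model, the function name, and the anxiety-scale numbers are assumptions for illustration, not material from the slides.

```python
def expected_retest(score: float, mean: float, reliability: float) -> float:
    """Expected retest score for someone with a given observed first score.

    Assumes a linear regression of retest on test, with equal means and
    variances and a test-retest correlation equal to `reliability`.
    """
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return mean + reliability * (score - mean)


# Hypothetical anxiety scale: mean 50; a child selected for a score of 80.
for rel in (0.9, 0.7, 0.5):
    print(rel, expected_retest(80, 50, rel))
```

The lower the reliability, the closer the expected retest score falls to the mean, even with no treatment at all. This is why groups selected for extreme scores can appear to improve on their own.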

Statistical regression:

In a study of family therapy, participating children grouped

because of high anxiety scores show considerably greater

reductions in anxiety than do the groups who scored average

and low on anxiety at the pre-test.

Selection: Refers to selecting participants for the

various groups in the study. Are the groups

equivalent at the beginning of the study?

This is not a threat in studies that employ random sampling and random assignment. All participants have an equal chance of being in the treatment or comparison groups, so the groups are expected to be equivalent at the start of the study.

Were participants self-selecting into experimental and comparison

groups? This could compromise the internal validity of the study.

Selection is not a threat for the one-group design but is a threat for the

two-group design.

Differential Attrition: Differential loss of

participants across groups.

Did some participants drop out? Did this affect the results?

Did about the same number of participants make it through the entire

study in both experimental and comparison groups?

This is a threat for any design with more than one group.

• Testing: Did the pre-test affect scores on the post-test?

o A pre-test may sensitize participants in unanticipated ways and their

performance on the post-test may be due to the pre-test, not to the

treatment, or more likely, an interaction of the pre-test and treatment.

o This is a threat for one-group designs.

o Not a threat for two-group designs. Both groups are exposed to the pre-

test and so the difference between groups will not be due to testing.

Selection: The experimental group in a study of self-control consisted of a high-ability class, while the comparison group was an average-ability class.

(Differential) Attrition: In a health-promotion intervention designed to test the effect of various exercises, the participants who disliked exercise most stopped participating.

Testing: In an experiment with logical reasoning performance as the dependent variable, a pre-test familiarizes the participants with the post-test and how to perform well.

• Instrumentation: Did any change occur during the

study in the way that the dependent variable was

measured?

o Is a threat for one-group designs, not for the two-group designs.

o Why? _________________________________

• Treatment diffusion: Did the comparison group

know or find out about the experimental/

intervention group and what transpired?

o A threat for two-group designs.

• Instrumentation: Two research assistants for a

self-control experiment with preschoolers

administered the post-test with different

instructions and procedures.

• Treatment diffusion: In an intervention study to

enhance college student adjustment, students in

the treatment and placebo control groups ‘compare

notes’ about what they are learning in sessions.

Threats to Internal/Construct Validity

• Compensatory equalization or rivalry: These simply weaken or strengthen the effect sizes associated with the intervention.

• Resentful demoralization: If participants learn that their group receives less desirable goods or services, they may feel resentful and demoralized and perform particularly poorly on the dependent variable.

o What effect would this have on treatment vs. control group differences? _____________________________

May increase magnitude of group differences, leading to an overestimate of the effect

Threats to Construct Validity

• Inadequate explication of the constructs

• Construct confounding

• Mono-operation bias

• Mono-method bias

Threats to Construct Validity

• Are all important aspects of the constructs represented in the independent and dependent variables?

o If yes, good

o If not, the constructs are underrepresented

• Do the independent and dependent variables also represent constructs that are not of interest in the study?

o If yes, there are surplus construct irrelevancies

o If no, good

• Inadequate explication of the constructs

• Construct confounding

Threats to Construct Validity Campbell and Fiske (1959) proposed two kinds of construct-validation evidence:

1. evidence of convergent validity

o evidenced by achieving similar results (convergence) across different measures of the same construct or different manipulations of the same construct. In other words, your measure of binge drinking would be expected to correlate with other existing measures of binge drinking and similar constructs, such as ____________ and __________________.

2. evidence of discriminant validity

o evidenced by observing no associations between your measure of ___________ and measures of other, unrelated constructs, such as ___________________ and ____________________. For example, we would expect that your measure of binge drinking would not be correlated with unrelated constructs, such as __________ or _____________.

• Mono-operation bias

• Mono-method bias

• http://www.youtube.com/watch?v=1Y3v5dgWlWM

External Validity

External validity refers to the degree to which the results of

an empirical investigation can be generalized to and across

individuals, settings, and times

External validity can be divided into

Population validity

Ecological validity

External Validity

Population Validity:

How representative is the sample of the population?

The more representative, the more confident we can

be in generalizing from the sample to the population.

How widely does the finding apply? Generalizing

across populations occurs when a particular research

finding works across many different kinds of people,

even those not represented in the sample.

External Validity

Ecological Validity is present to the degree that a result

generalizes across settings. Types include:

Interaction effect of testing

Interaction effects of selection biases and

experimental treatment

Reactive effects of experimental arrangements

Multiple-treatment interference

Experimenter effects.

Threats to External Validity

• Interaction of selection and treatment

o A characteristic of the treated group that interacts with the treatment

o Randomization would correct

• Example:

o An experimental evaluation of a new teaching method is conducted in a sample of low achieving students.

o Results will not generalize to a sample of students with heterogeneous abilities/achievement levels

Summary: External Validity

It’s a population, not the population

External validity comes from how, not how many.

Just because a sample comes from a population doesn’t

mean it generalizes to that population.

To Be Important, Must a Study Have

External Validity?

• Generalizing to other participants

• Generalizing to other settings

• Does a study have to be generalizable to many people?

• Does a study have to take place in a real-world setting?

Does a Study Need to Be

Generalizable to Many People?

Generalization mode

– Frequency claims

– Goal is to make a claim

about a population

– Real-world matters

External validity is essential!

Theory-testing mode

– Association and causal claims

– Goal is to test a theory rigorously, isolate variables

– Prioritize internal validity

– Artificial situations may be required

– Real world comes later

External validity is not the priority!

Does a Study Have to Take Place in a

Real-World Setting?

Theory-testing mode often requires

artificial settings.

Even laboratory settings can feel emotionally real.

– Experimental realism

Prioritizing Validities

• Which validity is appropriate to interrogate for every study?

• Which validities are not always relevant for a study?

• Why can’t researchers achieve all four validities in a single study?

• Which two validities are most often in trade-off?

• Which validity is most under the researcher’s control?

That study’s just not valid!

In-Class Activity 2

Return to 2 of your research topics of interest.

a. For 1 of your topics of interest, construct a research question that is framed as:

• A frequency claim

• An associational claim

• A causal claim

b. For each research question (3 total) prepare an operational definition of your constructs (i.e., measurable variables) and specify the method you will use to measure the construct

• Identify your IVs and DVs, and note whether your variables are Active or Attribute variables, and Continuous or Categorical variables

c. Restate each of your research questions as directional relationships between your measured constructs

d. Identify at least 3 possible threats to validity that may be relevant to the studies you design to test the associational and causal claim questions. (Select at least 2 threats to internal validity.) How could interpretation of your findings be impacted by each of these potential threats?

To Be Important, a Study

Must Be Replicable

Replication Studies

• Direct replication

• Conceptual replication

• Replication-plus-extension

• Meta-analysis

Replication

Replication Studies

Direct replication: Same variables, same operationalizations

Conceptual replication: Same variables, different operationalizations

Replication-plus-extension: Same variables, plus some new variables

How Meaningful Is That Effect Size?

Not that:

– The question is, is the study valid?

– That is not a valid study.

Say this:

– How’s the construct validity?

– Is external validity relevant here?

– Can the study support a causal claim?

Week 4

Research Designs ________________________________

o Pre-experimental

o Experimental

o Quasi-experimental

Psychology

Class 4

257 HEDCO Building

541-346-0913

office hrs: Tues 12-1 pm

Review In class activity #2

In-Class Activity 2

Return to 2 of your research topics of interest.

a. For 1 of your topics of interest, construct a research question that is framed as:

• A frequency claim

• A causal claim

b. For each research question (3 total) prepare an operational definition of your constructs (i.e., measurable variables) and specify the method you will use to measure the construct

• Identify your IVs and DVs, and note whether your variables are Active or Attribute variables, and Continuous or Categorical variables

c. Restate each of your research questions as directional relationships between your measured constructs

d. Identify at least 3 possible threats to (internal) validity that may be relevant to the studies you design to test the associational and causal claim questions. Why might these potential threats be an issue with tests of your research question?

Research Designs

• Pre-Experimental

• Experimental

• Quasi-Experimental

• Conducted in the lab vs. in the field

• Making associative or causal claims

Three Criteria for Causation

• Covariance

• Temporal precedence

• Internal validity

[Figure: Research Design decision tree. Decision nodes: “One or more IVs are manipulated?” (yes / no) and “Random Assignment”. Outcomes: “Experiment” and “Quasi-experimental”.]

In-Class Activity 3

List your research questions that frame

• A causal claim

a. Use an experimental design to test your causal research question

• Identify one IV and one DV

• Select and describe your design choice

o Which threats to internal validity does it control and why?

o List 1 threat to external validity that exists and why.

o Explain how each threat would impact interpretation of your findings.

b. Use a quasi-experimental design to test your research question

• Identify one IV and one DV

o Which 2-3 threats to internal validity does it NOT control and why?

Pre-Experimental Designs

• Heppner, Kivlighan, & Wampold

(2008) refer to these three

designs as “uninterpretable”

• Multiple threats to internal

validity of these studies

One-shot case study: No way to infer that any change has taken place; maturation & history can’t be ruled out because no control group was used.

Static group comparison: Great difficulty attributing results to the intervention. Groups could differ in many different ways beyond treatment effects. Can’t discern those possible differences.

One-group pretest-posttest: Better than one-shot case study, because we can determine if change occurred. Cause of change remains ambiguous.

Pre-Experimental Designs

• One shot case study

• One group pretest-posttest study

• Static group comparison study


Experimental Designs

Pretest-Posttest Control Group Design

R O1 X O2

R O1 O2

Posttest–Only Control Group Design

R X O2a

Solomon Four-Group Design

R O1 X O2

R O1 O2

R X O2

R O2

R = randomization

O1 = pretest

O2 = posttest

X = intervention

Randomize participants to 2+ groups (1

treatment & 1 no-tx, i.e., control). Both

groups get a pre- and post-test. Enables

test of X on O2, reflected in the differences

observed across groups.

Pretest: helps clarify the source of differential attrition, strengthens the statistical test by controlling for pre-treatment differences in the DV, and assists in testing moderation effects.
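A minimal sketch of how the pretest-posttest control group design isolates the effect of X: compare the mean change (O2 minus O1) in the treatment group with the mean change in the control group. The anxiety scores below are hypothetical illustration data, and the difference-in-gains summary is only one simple way to analyze this design.

```python
from statistics import mean


def mean_gain(pretest, posttest):
    """Average within-person change from pretest (O1) to posttest (O2)."""
    return mean(post - pre for pre, post in zip(pretest, posttest))


# Hypothetical anxiety scores (lower = less anxious); groups formed by R.
treatment_pre = [20, 24, 22, 26]
treatment_post = [14, 17, 15, 18]
control_pre = [21, 23, 22, 26]
control_post = [20, 21, 21, 24]

treatment_gain = mean_gain(treatment_pre, treatment_post)
control_gain = mean_gain(control_pre, control_post)

# The control group's change estimates history, maturation, and testing
# effects; subtracting it leaves the change attributable to X.
estimated_effect = treatment_gain - control_gain

print(treatment_gain, control_gain, estimated_effect)
```

Both groups improve somewhat, which is what history, maturation, or testing would produce; only the additional change in the treatment group is credited to the intervention.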

Randomize participants to 2+ groups (1

treatment & 1 no-tx, i.e., control). Both

groups get a post-test. Enables test of X

on O2a. Less time, expense, & avoid

repeated testing.

Used when there are

concerns about the effect

of a pretest on participants.

Added value is ability to

examine effects of pretest.

Controls for most threats

to internal validity. Is

costly in time & resources.

Experimental Designs

• Pretest-posttest control group design

• Posttest only control group design

• Solomon four-group design
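The added value of the Solomon four-group design, the ability to examine effects of the pretest, can be sketched from the four posttest means. The cell means below are hypothetical; the contrast logic (treatment effect with a pretest versus without one) is a simplified stand-in for a full 2 x 2 analysis.

```python
# Hypothetical posttest (O2) means for the four randomized groups.
posttest_means = {
    ("pretest", "treatment"): 30.0,     # R O1 X O2
    ("pretest", "control"): 22.0,       # R O1   O2
    ("no pretest", "treatment"): 27.0,  # R    X O2
    ("no pretest", "control"): 21.0,    # R      O2
}


def treatment_effect(pretest_status):
    """Treatment minus control posttest mean within one pretest condition."""
    return (posttest_means[(pretest_status, "treatment")]
            - posttest_means[(pretest_status, "control")])


effect_with_pretest = treatment_effect("pretest")
effect_without_pretest = treatment_effect("no pretest")

# If taking the pretest sensitizes participants to the treatment,
# the two effects differ (a pretest x treatment interaction).
pretest_by_treatment = effect_with_pretest - effect_without_pretest

print(effect_with_pretest, effect_without_pretest, pretest_by_treatment)
```

A nonzero interaction in real data would suggest that results from pretested samples may not generalize to people who were never pretested.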


Quasi-Experimental Designs

• No randomization

• One or more IVs are experimentally-manipulated

4 reasons to select these over a true experimental design 1. Cost

2. Sample selection

3. Ethical considerations

4. (un)Availability of suitable control groups

• Three good non-equivalent

groups designs

Nonrandom assignment to groups. Pretest

enables us to assess for similarity of

participants on the DV (though groups

won’t be similar on other 3rd variables).

Selection may still be a threat. Less time,

expense, & avoid repeated testing.

Enables us to clarify and control for

maturation effects. Must deal with the

autocorrelations in data when they are

analyzed.

Strengthens 1st design by adding another

pretest. Clarify whether maturation is

different across groups.

• Pretest-posttest nonequivalent groups

• Time series designs

• Nonequivalent before-after design
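The note above about autocorrelation in time series data can be made concrete. Repeated observations of the same participants tend to resemble their neighbors in time, which violates the independence assumed by many standard tests. Below is a minimal sketch of a lag-1 autocorrelation; the function name and the two example series are hypothetical.

```python
def lag1_autocorrelation(series):
    """Correlation of each observation with the one that follows it."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    numerator = sum(deviations[t] * deviations[t + 1] for t in range(n - 1))
    denominator = sum(d * d for d in deviations)
    return numerator / denominator


# Hypothetical weekly scores that drift steadily upward.
trending = [1, 2, 3, 4, 5, 6, 7, 8]
# Hypothetical scores that flip direction every week.
alternating = [1, -1, 1, -1, 1, -1, 1, -1]

print(lag1_autocorrelation(trending))
print(lag1_autocorrelation(alternating))
```

Values far from zero in either direction signal that adjacent observations are not independent, so the analysis needs a method that models this dependence.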

Technical function of good research

design = To control variance (attend to the 4 validities)

MAXMINCON (Kerlinger, 1973, 1986)

Maximize systematic variance: Maximize variance of the variables in your substantive research hypothesis

– Experimental variable: make conditions as different as possible

– Associational variable: seek as wide a range of scores/levels as possible

Minimize error variance: Reduce the errors in measurement of your constructs and increase the reliability of your measures

Control extraneous variance: Control variance of extraneous or unwanted variables that may affect or relate to your variables of interest

3 ways to control these
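The advice under "maximize systematic variance" to seek a wide range of scores on an associational variable can be illustrated with a simulation of range restriction. This is a sketch under assumptions: two standardized variables with a hypothetical population correlation of .50, and a restricted sample that keeps only participants within half a standard deviation of the mean on X.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_pairs(n=20000, rho=0.5, seed=3):
    """Draw (x, y) pairs whose population correlation is rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = rho * x + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        pairs.append((x, y))
    return pairs


pairs = simulate_pairs()
full_r = pearson_r([x for x, _ in pairs], [y for _, y in pairs])

# Restricted range: keep only mid-range scorers on X.
narrow = [(x, y) for x, y in pairs if -0.5 < x < 0.5]
restricted_r = pearson_r([x for x, _ in narrow], [y for _, y in narrow])

print(f"full range r = {full_r:.2f}, restricted range r = {restricted_r:.2f}")
```

The underlying relationship is identical in both samples; only the variance of X differs, and the narrower sample yields a much weaker observed correlation.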

In-Class Activity 3

List your research questions that frame

• A causal claim

a. Use an experimental design to test your causal research question

• Identify one IV and one DV

o List 1 threat to external validity that exists and why.

b. Use a quasi-experimental design to test your research question

• Identify one IV and one DV

o Which 2-3 threats to internal validity does it NOT control and why?

Exam 1

• Due Tuesday, October 28th, 2014 by 4:30 PM

o Located in Blackboard Research Design, in Assignments, named “Exam 1”

o Exam goes live Wed 8:00 am and closes following Tues 4:30 PM

• Multiple choice

• Short answer / essay questions (prepare in paragraph format using APA

style)

• “No backtracking” option enabled

Review In class activity #3

Psychology

Class 6

257 HEDCO Building 541-346-0913 eskowron@uoregon.edu

Measurement

Data Collection

Sampling

Operationalizing Study Variables

Measurement

Constructs: concepts that cannot be directly observed

– When operationally-defined, they are observable

Variable: a symbol to which numbers or values are assigned; can take on any set of values; can be dichotomous to continuous

Operational definitions: assign meaning to a construct/variable by spelling out what the investigator must do to measure it

– (1) measured: describes how the variable will be measured

– (2) experimental: spells out the details of the investigator’s manipulation of a variable
   – Reinforcement schedule
   – Intervention type & dosage

Construct definition & operationalization

Each construct has only one Conceptual definition (i.e., researcher’s definition of the variable at an abstract level)

Each construct may have multiple Operational definitions (i.e., representing the researcher’s specific decision about how to measure or manipulate the variable)

Conceptualizing Race, Culture, & Ethnicity

• Race
– ‘The presumed classification of all human groups on the basis of visible physical traits or phenotype and behavioral differences’ (Robert Carter, 1995)
– Not a biological reality
– A social construct used to categorize people
– Referenced to perpetuate power differences and social inequalities

• Ethnicity
– One’s national origin, religious affiliation, or other type of socially or geographically-defined group (Carter, 1995)

• Culture
– The values, beliefs, language, rituals, traditions, and other behaviors that are passed down from one generation to another within any social group (Helms & Cook, 1999).

Methods of Measurement

1. Self-report
– Participant makes an observation or report on self
– + : easy to administer, economical, accesses private thoughts, feelings, behavior not accessible to investigators
– -- : vulnerable to distortion, presumes client insight/understanding about construct being measured

2. Other-report (parents, therapist, teacher, etc.)
– Respondents rate the participant on some dimension(s)
– + : easy to administer, economical
– -- : potential systematic bias (e.g., cultural competence of rater – cross-cultural child development study)

3. Behavioral observations
– Measures of overt behavior by trained observers using coding system
– + : direct and objective
– -- : presumption that observed behavior is representative; costly; feasibility?

4. Neurobiological indices

5. Interviews
– + : flexible, high completion rate
– -- : costly; feasibility?

6. Unobtrusive measures
– Assessment conducted without participants’ awareness
– + : eliminates reactivity to measurement
– -- : expensive?; some types are unethical

Operationalizing the Independent Variable

Those you can manipulate (i.e., active IVs)

1. Determining conditions of the IV
– Referred to as levels of the IV, groups, categories, and treatments (interchangeable terms)
– These are often categorical variables (but don’t have to be)
– Conditions of the IV are determined by YOU, the researcher, because they are manipulated

2. Adequately reflecting the constructs of interest
– Your IV must be well-defined and operationalized
– See ‘psychometrics’ section below

3. Limiting differences between conditions
– Try to make sure that the different conditions of the IV differ only on the dimension of interest (e.g., math problem difficulty groups: easy, moderate, hard) and not other dimensions (how much tutoring was available, etc.)

4. Establishing the salience of differences in conditions ______________________

Manipulation checks
– Used to verify that the conditions of the IV
   • differed as intended
   • didn’t differ on other dimensions
   • and that treatments were implemented in the intended fashion

Those you cannot manipulate, i.e., attribute IVs, aka ‘status’ variables in HWK

• Statistical tests with these variables are used to detect associations

• FYI: Stats used in tests of associational and causal claims are basically the same, but it is more difficult to draw causal inferences with status variables because they are not manipulated

• IT IS THE RESEARCH (STUDY) DESIGN, NOT THE STATS ANALYSIS USED, THAT DETERMINES THE INFERENCE STATUS OF THE STUDY

– i.e., associational claims vs. causal claims, etc.

Operationalizing the Dependent Variable

• Have a rationale for why you selected the DVs of choice, and not others, and why you operationalized the DV in the manner you did

• e.g., Webster-Stratton (1988) parent-report of child behavior is a function of parent psychopathology

• Orlinsky et al. (1994) psychotherapy outcome ratings differ per therapist, client, and observer ratings

1. Ensure the measure used to operationalize the DV is psychometrically-strong (i.e., good reliability & validity)

2. Consider role of reactivity in DV assessment

3. Consider other procedural issues with DV assessment
– Administration time
– Order of presentation
– Reading level

Scales of Measurement

Categorical scales

• Nominal (i.e., categorical)
– A scale with numerical values that represent categories of an attribute or "name" the attribute uniquely
– e.g., sex or ethnicity
– NOTE: You cannot subject nominal scale measures to the same statistical tests as the other three scales

Quantitative scales

• Ordinal
– Measurement of attributes that can be rank-ordered
– e.g., years of schooling completed

• Interval
– Measurement that is rank-ordered AND the distance between locations on the scale has meaning
– e.g., measurement of temperature in Fahrenheit or Celsius (40 degrees is 20 degrees warmer than 20 degrees, but it is not twice as hot, because zero is arbitrary)

• Ratio
– Measurement that is rank-ordered AND the distance between locations on the scale has meaning and there is an absolute zero that is meaningful
– e.g., number of study participants who re-abused their children following treatment
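A quick check of the difference between interval and ratio scales, using temperature. On an interval scale, differences are meaningful but ratios are not, because the zero point is arbitrary. The sketch below converts the same temperatures between Fahrenheit and Celsius; the variable names are illustrative only.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9


c20, c30, c40 = (fahrenheit_to_celsius(f) for f in (20, 30, 40))

# Ratios of raw values change when the unit changes, so "twice as hot"
# is not a meaningful statement on an interval scale.
ratio_in_fahrenheit = 40 / 20
ratio_in_celsius = c40 / c20

# Ratios of differences survive the change of unit, which is why
# distances between scale points are meaningful.
diff_ratio_fahrenheit = (40 - 20) / (30 - 20)
diff_ratio_celsius = (c40 - c20) / (c30 - c20)

print(ratio_in_fahrenheit, round(ratio_in_celsius, 2))
print(diff_ratio_fahrenheit, round(diff_ratio_celsius, 2))
```

A ratio-scale count, such as the number of participants, has a true zero, so 40 participants really is twice as many as 20 in any unit.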

Measurement Activities

1. Classify each operational variable below as categorical or quantitative. If the variable is quantitative, further classify it as ordinal, interval, or ratio.

a) Number of books a person owns

b) A book’s sales rank on amazon.com

c) Location of a person’s hometown (urban, rural, or suburban)

d) Nationality of the participants in a cross-cultural study of Canadian, Ghanaian, and French students

e) A student’s grade in school

Psychometrics

• Reliability of measures

• Validity of measures

• Relationship between R & V

Reliability

• Internal consistency: the extent to which items within a test are similar or ‘hang together’
– Use a single instrument administered to a group of people on one occasion
– Compute Cronbach's Alpha: an index of intercorrelations between all items on test
– Reliability estimates = .70 or higher indicate very good reliability

• Inter-rater: degree to which different raters/coders give consistent ratings/scores of the same phenomenon
– Two or more raters code same phenomenon
– Categorical measures:
   – Calculate the percent of agreement between the raters
   – Adjust this for ‘chance agreement’ using kappa coefficient
– Continuous measures:
   – Calculate the correlation between the ratings of the two observers

• Test-retest: consistency of a measure from one time to another
– Use a single instrument administered to a group of people on two+ occasions
– Calculate a correlation
– Is this best used for measures of constructs that are State or Trait-like? Stable or shifting over time? Why…?
– Shorter the time gap, the higher the correlation; the longer the time gap, the lower the correlation
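The two reliability indices named above can be sketched in a few lines. This is a minimal illustration, not a replacement for a statistics package: `cronbach_alpha` uses the standard variance-based formula, `cohen_kappa` adjusts percent agreement for chance agreement between two raters, and all data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows = respondents, columns = items."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical codes."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    # Chance agreement: product of each rater's base rate per category.
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                   for c in categories)
    return (observed - expected) / (1 - expected)


# Hypothetical responses of five people to a three-item scale.
responses = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5], [5, 5, 5]]
alpha = cronbach_alpha(responses)

# Hypothetical codes from two observers for ten observation intervals.
coder_a = ["on", "on", "off", "off", "on", "off", "on", "on", "off", "off"]
coder_b = ["on", "on", "off", "on", "on", "off", "on", "off", "off", "off"]
kappa = cohen_kappa(coder_a, coder_b)

print(f"alpha = {alpha:.2f}, kappa = {kappa:.2f}")
```

In this example the coders agree on 80% of intervals, but half of that agreement would be expected by chance, so kappa is noticeably lower than raw percent agreement.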

2. For each measure below, indicate which kinds of reliability would be appropriate to evaluate.

a) Researchers place unobtrusive video recording devices in the living rooms of 20 children. Later, coders view tapes of the living areas and code how many minutes each child spends playing video games.

b) Clinical psychologists have developed a seven-item self-report measure to quickly identify people who are at risk for post-traumatic stress disorder.

c) Psychologists measure how long it takes a mouse to learn an eye-blink response. For 60 trials, they present a mouse with a distinctive blue light followed immediately by a puff of air. The 5th, 10th, and 15th trials are test trials, in which they present the blue light alone (without the air puff). The mouse is said to have learned the eye-blink response if observers record that it blinked its eyes in response to a blue light test trial. The earlier in the 60 trials the mouse shows the eye-blink response, the faster it has learned the response.

d) A restaurant owner uses a response card with four items in order to evaluate how satisfied customers are with the food, service, ambience, and overall experience. Each item is scaled from one to four stars.

e) Educational psychologists use teacher ratings of classroom shyness (on a nine-point scale, where 1 = “not at all shy in class” and 9 = “very shy in class”) to measure children’s temperament.

Validity (of measures)

• Physical science is fortunate to have standard measurements

– e.g., Platinum-iridium bar kept at U.S. NIST – international standard for length of 1 meter

– I can compare my 1 meter ruler to this standard and know if it measures what it’s supposed to measure

– No such luck in the social sciences…our constructs are typically not directly observable (i.e., anxiety, happiness, self-regulation)

– No way to directly measure these constructs

– We work with estimations (via self report, observed behavior, neurobiological measures, other’s reports, etc.)

• Construct validity = to what extent is our measure of X really tapping into it?

– Definition: to what extent does this test/measure (i.e., an operationalization) accurately reflect the construct it’s intended to measure?

Reliability: Do you get consistent scores every time?

– Test-retest reliability: People get consistent scores every time they take the test

– Internal consistency reliability: People give consistent scores on every item on a questionnaire

– Interrater reliability: Two coders’ ratings of a behavior are consistent with each other

Measurement (construct) Validity: Does it measure what you intend to measure?

Two subjective ways to assess validity

– Face validity: It looks like what you want to measure

– Content validity: The measure contains all parts that your theory says it should contain

Four Empirical Ways to Assess Validity

– Predictive validity: Your measure is correlated with a relevant outcome in the future

– Concurrent validity: Your measure is correlated with a relevant outcome now, in the present

– Convergent validity: Measure is more strongly associated with measures of similar constructs

– Discriminant validity: Measure is less strongly associated with measures of dissimilar constructs

(Morling, 2012)

Relationship between Reliability & Validity

Reliability is a necessary but not sufficient condition for validity

Relationship between Reliability & Validity

Concurrent & Predictive Validity

• Both evaluate whether scores on your measure are related to scores on other concrete outcomes that they should be related to

• e.g., measure of clinical skills/aptitude or graduate school aptitude

Concurrent Validity
– Does your measure correlate with a relevant ‘outcome’ right now, in the present?
– e.g., correlate scores on your measure of clinical skill with outcome (client ratings of therapeutic alliance; ______________)

Predictive validity
– Does your measure correlate with a relevant ‘outcome’ measured in the future?
– e.g., correlate scores on your measure of clinical skill assessed now, with an outcome measured in the future (client improvement in therapy; ____________)

• Can calculate via a correlation coefficient, r

Convergent & Discriminant Validity

• Does the test show a meaningful pattern of associations with other measures?

• Your measure should:
– Correlate more strongly with other measures of similar constructs, and
– Correlate less strongly with measures of other, different constructs

Convergent Validity
– Your measure correlates more strongly with other measures of similar constructs
   • e.g., Differentiation of self scores should correlate with: __________________
   • ______________________________________________________________

Discriminant Validity
– Your measure correlates less strongly with measures of other, different constructs
   • e.g., Differentiation of self scores should NOT correlate with: ______________
   • ______________________________________________________________

• Can also calculate via a correlation coefficient, r

• No absolute level of correlation indicates convergent or discriminant validity evidence…look to the pattern of findings across the nomological net
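Looking to the pattern of correlations, rather than any single value, can be sketched as follows. The scores are hypothetical, and the "similar" and "unrelated" measures are generic placeholders rather than specific instruments.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical scores for six participants.
new_measure = [1, 2, 3, 4, 5, 6]
similar_construct = [2, 1, 4, 3, 6, 5]    # established measure, related construct
unrelated_construct = [3, 5, 1, 6, 2, 4]  # measure of a different construct

convergent_r = pearson_r(new_measure, similar_construct)
discriminant_r = pearson_r(new_measure, unrelated_construct)

print(f"convergent r = {convergent_r:.2f}")
print(f"discriminant r = {discriminant_r:.2f}")
```

The evidence for construct validity is the contrast between the two coefficients: a strong correlation with the similar construct alongside a near-zero correlation with the unrelated one.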

Cultural Validity “…is concerned with the construct, concurrent, and predictive validity of theories and models across cultures, i.e., culturally different individuals” (Leong & Brown, 1995, p. 144)

• Planning your study
– Use MC theories to conceptualize the research; consult with cultural communities
– Translate demographics into salient psychological characteristics (e.g., ethnic identity development, experience of micro-aggressions)

• Selection of measures
– Use multiple measures to represent each construct
– Pilot test measures with your target population
– Use culturally congruent measures in your study
– Create or adapt ethnocentric measures

• Recruiting participants
– Representative of your target population
– Use procedures congruent for this cultural group
– Recruit to represent underlying psychological characteristics of interest

• Analyzing your data
– Evaluate cultural hypotheses & rival, competing hypotheses
– Examine moderator effects of cultural variables

• Interpreting results
– Design your study to benefit participants directly
– Represent participants’ voices authentically when interpreting data
– Integrate service into community as way of ‘giving back’
– Engage participants in interpretation of data and share findings

Using Factorial Designs to Study External Validity

• Factorial designs comprise at least two independent variables, and each IV has 2+ levels
– IV-1: intervention (treatment, control group: 2 levels)
– IV-2: status variable (i.e., demographic or individual difference variable) (e.g., gender: male, female: 2 levels)

                Independent Variable
Gender          1. Treatment     2. Control
1. male
2. female

• Enables us to learn whether the treatment works or works better for one level of the status variable than another (via ‘interaction’ effects)
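A minimal sketch of reading an interaction effect from a 2 x 2 factorial design: compute the treatment effect separately at each level of the status variable, then compare them. The symptom scores are hypothetical, and a real analysis would test the interaction with a factorial ANOVA rather than compare raw means.

```python
# Hypothetical symptom scores (lower = better) for each cell of the design.
cells = {
    ("treatment", "male"): [10, 12, 14],
    ("control", "male"): [17, 18, 19],
    ("treatment", "female"): [8, 9, 10],
    ("control", "female"): [17, 18, 19],
}


def cell_mean(condition, gender):
    scores = cells[(condition, gender)]
    return sum(scores) / len(scores)


def simple_effect(gender):
    """Treatment minus control mean at one level of the status variable."""
    return cell_mean("treatment", gender) - cell_mean("control", gender)


effect_male = simple_effect("male")
effect_female = simple_effect("female")

# A nonzero difference between simple effects indicates an interaction:
# the treatment works better for one group than for the other.
interaction = effect_female - effect_male

print(effect_male, effect_female, interaction)
```

Here the treatment helps both groups, but the benefit is larger for one of them, which is exactly the question about generalizing across a status variable that the factorial design is built to answer.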

Recommendations for conducting culturally-valid quantitative research

– Identify demographic variables that serve as proxy variables & measure those social-psychological variables directly
   • Ethnic & racial group status as a proxy for socio-economic status
   • Racial group status as proxy for stage of racial identity development

– Evaluate external validity of studies, not solely based on demographic characteristics of a sample, but on salient psychological characteristics & a strong theoretical rationale
   • e.g., potential generalization of research on racial identity development from African-American samples to other stigmatized ethno-cultural populations

– Benefits
   – Conceptual generalization promotes better theory building, and
   – Use of social-psychological characteristics (rather than simple demographics)…
      • …may limit use of inappropriate generalizations to an entire population
      • …would enable focus on psychological antecedents for psychological outcomes
      • …could divert efforts away from token sampling of ethno-cultural groups that include only highly acculturated members who fail to represent the important psychological characteristics of the larger population

Recommendations continued…

• Improve construct validity in measurement via evidence of cultural equivalence of tests/measures

– Linguistic equivalence: do translated items carry the same meaning in the target language as they do in their source language?

– Functional equivalence: does the phenomenon have similar functions across cultures? (e.g., assertiveness as ‘adaptive’)

– Conceptual equivalence: does the concept have an equivalent in other cultures? (e.g., defining IQ…)

– Psychometric equivalence: are the ways in which the concept is quantified equivalent across cultural groups? (e.g., timed components of IQ test…)

• Involve indigenous experts in formulating theory, study hypotheses, research procedures, & interpretation of results

• Strengthen cultural validity of your research study

Face & Content Validity (most subjective)

Face Validity

– Weakest way to try to demonstrate construct validity
– To what extent does this measure appear "on its face" to be a good translation of the construct?
– Is essentially a subjective judgment call

Content validity
– Involves a subjective check of the operationalization against the relevant content domain for the construct.
– Often involves surveying ‘experts’ in the content domain to evaluate content capture of your measure

Concluding Notes re: Measurement of Constructs

1. A single operationalization (i.e., single scale or instrument) will almost always poorly represent a construct

2. The correlation between two constructs is attenuated (i.e., weakened) by unreliability in measurement

3. Unreliability always makes it more difficult to detect true effects (should any be present) because of reduced statistical power.

4. The correlation between two measures using the same method is inflated by (shared) method variance.

5. If possible, multiple measures using multiple methods should be used to operationalize a construct.

6. Typically, interpretations of relationship should be made at the construct level, for seldom are we interested in the measures per se. Awareness of the effects of unreliability and method variance is critical for drawing proper conclusions.
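Point 2 above can be made concrete with the classic correction-for-attenuation formula from classical test theory, r_observed = r_true × sqrt(rel_x × rel_y). The formula is not stated on the slides; it is brought in here as a standard result, and the reliabilities and correlation below are hypothetical values chosen only for illustration.

```python
import math

def attenuated_r(true_r, rel_x, rel_y):
    """Expected observed correlation, given the construct-level correlation
    and the reliability of each measure (classical test theory)."""
    return true_r * math.sqrt(rel_x * rel_y)

def disattenuated_r(observed_r, rel_x, rel_y):
    """Estimate of the construct-level correlation from an observed one."""
    return observed_r / math.sqrt(rel_x * rel_y)

# Hypothetical example: constructs correlate .50, measures have
# reliabilities of .70 and .80.
observed = attenuated_r(0.50, 0.70, 0.80)
print(round(observed, 3))  # 0.374
print(round(disattenuated_r(observed, 0.70, 0.80), 3))  # 0.5
```

The observed correlation is always weaker than the construct-level one whenever either reliability is below 1, which is why interpretation at the construct level requires attention to reliability.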

SAMPLING

Sampling

• When we consider external validity, we ask whether the results of a particular study can be generalized to other people in the population, or to the kinds of settings we’re interested in.

• To interrogate the external validity of a frequency or causal claim, we ask, for example:

– Do clients who rated this therapist’s warmth adequately represent all of the therapist’s former clients?

– Can we predict the results of the presidential election from the results of this poll taken from these 1,500 people?

• Sample: a portion of the population, e.g., one potato chip

• Population: all, e.g., the whole bag of chips

• You don’t need to study the whole population. You just need to ensure that the sample you study adequately reflects the population

Sampling

– Define your population of interest

– Now you can assess how well your sample represents it

• Bias

– Samples are biased when they are unrepresentative of the population

– Biased samples lead you to draw the wrong conclusions about the population

• e.g., your 1 potato chip is burnt (biased sample)

• This would lead you to conclude something wrong about the whole bag of chips

• e.g., Presidential election poll

• Biased sample would include too many of the most unusual (not typical) people

• e.g., Therapist ratings

• Clients who rate their therapist on a website may tend to be ones who are angry or disgruntled, and not represent the rest of the therapist’s clients very well

Sources of Biased Samples

• Sampling only those people who are easy to contact

• Sampling only those who you can contact

• Sampling only those who self-select (i.e., invite themselves)

Getting a Representative Sample

Probability sampling

– Draw the sample at random from that population

– Every member of the population has an equal chance of being in the sample

1. Simple random sampling

– Most basic form of probability sampling, but difficult and time-consuming

– Assign a number to every person in the population

– Use a table of random numbers to select a sample from the population

2. Cluster sampling

– Start with a list of clusters, take a random sample of clusters from that list, and include every person from each of those selected clusters

– E.g., you want to randomly sample school districts in OR; start with a list of districts (clusters) in the area, randomly select 4 of those districts (clusters), and include every child from each cluster in your sample

3. Multistage sampling

– Similar to #2, but you select two random samples

– Start with a list of clusters and take a random sample of clusters from that list, but then take a random sample of children from each selected cluster

4. Stratified random sampling

– Select particular demographic characteristics on purpose and then randomly select individuals within each of the categories

– e.g., in a study of self-regulation development, stratify on child age to obtain at least X kids each from age 3, age 4, and age 5 in the study

Getting a Representative Sample

Probability sampling

– Draw the sample at random from that population

– Every member of the population has an equal chance of being in the sample

4. Stratified Random Sampling cont.

– Oversampling

• Is a variant of stratified random sampling

• Use stratified random sampling and deliberately include more of one group, usually when that group is difficult to engage in research or is present in low numbers in your population

– e.g., oversampling for physically abused children helps to ensure there are adequate numbers of participants in the sample
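The four probability sampling methods above, plus oversampling, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the district names, child IDs, ages, and sample sizes are hypothetical, and a seeded pseudo-random generator stands in for the table of random numbers mentioned on the slide.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member of the population has an equal chance of selection."""
    return rng.sample(population, n)

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters; keep everyone in each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

def multistage_sample(clusters, n_clusters, n_per_cluster, rng):
    """Randomly pick clusters, then randomly sample people within each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(rng.sample(clusters[name], n_per_cluster))
    return sample

def stratified_sample(people, stratum_of, n_per_stratum, rng):
    """Randomly sample a set number of people within each stratum.
    Unequal targets in n_per_stratum amount to oversampling."""
    strata = {}
    for person in people:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for key in sorted(n_per_stratum):
        sample.extend(rng.sample(strata[key], n_per_stratum[key]))
    return sample

rng = random.Random(2014)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10 districts with 20 children each.
districts = {f"district_{d}": [f"d{d}_child_{i}" for i in range(20)]
             for d in range(10)}
all_children = [c for kids in districts.values() for c in kids]

srs = simple_random_sample(all_children, 30, rng)
by_cluster = cluster_sample(districts, 4, rng)      # 4 whole districts
two_stage = multistage_sample(districts, 4, 5, rng) # 5 children from each of 4

# Hypothetical children aged 3, 4, or 5; age 5 is deliberately oversampled.
children_with_age = [(f"child_{i}", 3 + i % 3) for i in range(90)]
stratified = stratified_sample(children_with_age, lambda c: c[1],
                               {3: 10, 4: 10, 5: 20}, rng)

print(len(srs), len(by_cluster), len(two_stage), len(stratified))  # 30 80 20 40
```

Note that with oversampling, members of the oversampled stratum no longer have the same selection chance as everyone else, so analyses that describe the whole population typically reweight the strata.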

Random Sampling vs. Random Assignment

• Random sampling (i.e., probability sampling)

– Get a sample using some random method so that each member of the population of interest has an equal chance of being in the sample

– Enhances ___________ validity

• Random assignment (used only in experimental designs)

– Assign members of the sample at random to the groups or conditions of the IV, for example, by flipping a coin

– Enhances ____________ validity
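Random assignment can be sketched as follows. The slide suggests flipping a coin for each person; the sketch below instead shuffles the sample and deals participants out to conditions in turn, a common variant that keeps group sizes equal. The participant IDs and condition names are hypothetical.

```python
import random

def randomly_assign(participants, conditions, rng):
    """Shuffle the sample, then deal participants to conditions in turn
    so that group sizes differ by at most one."""
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups

rng = random.Random(7)  # fixed seed so the sketch is reproducible
sample = [f"participant_{i}" for i in range(20)]  # already-recruited sample
groups = randomly_assign(sample, ["treatment", "control"], rng)
print({condition: len(members) for condition, members in groups.items()})
# {'treatment': 10, 'control': 10}
```

Random sampling decides who gets into the study; random assignment, shown here, decides which condition each person in the study receives.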

Non-Representative Sampling Methods

• Convenience sampling

– Samples chosen on the basis of who is easy to access

• Purposive sampling

– Choosing a sample of only certain kinds of people you want to study

• Snowball sampling

– A variation of purposive sampling used to find rare individuals for a research study, or when the sample is otherwise hard to obtain

– Each participant in the study is asked to recommend a few acquaintances to the study

Psychology

Class 7


Three conditions for determining causality

1. Co-variation (i.e., correlation)

2. Temporal precedence

3. Ruling out alternative explanations (due to extraneous 3rd variable threats…i.e., internal validity)

Tools for Testing “Associational” Hypotheses

Kinds of studies that lead to associational claims

• Correlational research (i.e., ex post facto)

o 2+ measured variables (regardless of the stats used) make a study correlational

o Prioritize construct validity, statistical conclusion validity, & external validity

o Avoid the temptation to make causal inferences from these kinds of studies

Kinds of graphs & statistics used to describe associations

• Bivariate correlations

o Positive, negative, zero, & curvilinear

o Graph the association between scores on 2 variables using a scatterplot and summarize it with a correlation coefficient, r

Designing and evaluating studies that make an associational claim (via the 4 big validities)

Testing “Associational” Hypotheses

• Bivariate correlations

o Positive, negative, zero, & curvilinear

o Graph association between scores on 2 variables using scatterplot

o Calculate strength of correlation coefficient, r

Testing “Associational” Hypotheses

Bivariate correlations

Correlation coefficients (r)

Effect size
Type     Small   Medium   Large
r        .10     .30      .50
d/g      .20     .50      .80
ratio    1.50    2.50     4.25

Associations:

• between 2 continuous variables: correlation coefficient, r

• when 1 variable is categorical: t test (or a point-biserial correlation)

• when both variables are categories: phi coefficient

Ascertain strength of associations: Cohen’s conventions…
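A minimal sketch of computing r for two continuous variables and reading its strength against Cohen’s conventions for r (.10, .30, .50). The "negligible" label for values below .10 is an assumption added for completeness, and the data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cohen_label(r):
    """Label the size of r using Cohen's conventions; the sign is ignored."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "negligible"  # label below .10 is an assumption, not from the slide

# Hypothetical scores on two continuous variables.
absences = [0, 1, 2, 3, 4, 5, 6, 7]
exam_score = [95, 88, 90, 80, 85, 70, 72, 60]
r = pearson_r(absences, exam_score)
print(round(r, 2), cohen_label(r))
```

These cutoffs are conventions for describing strength, not tests of statistical significance, which depends on sample size as well.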

Testing Associational Hypotheses

• Designing and evaluating studies that make an associational claim (via the 4 big validities)

• Statistical conclusion validity

1. Effect size?

2. Correlation statistically significant?

3. Are there subgroups?

• Statistical conclusion validity

o Are there subgroups?

• Interpret scatterplot below…

• Consider subgroups (class standing)

[Scatterplot; axis label: # Absences]

• Statistical conclusion validity

o Are there subgroups?

• Now consider subgroups (class standing) and interpret scatterplot…

[Scatterplot; axis label: # Absences; subgroups: Freshmen, Juniors, Seniors]

• Statistical conclusion validity

o Could outliers (extreme scores) be affecting the relationship between variables?

o More likely with smaller samples

[Scatterplot; axis label: # Absences]
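The outlier point can be illustrated with a small worked example. The scores below are hypothetical; the sketch only shows how a single extreme case can move r substantially when the sample is small.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical small sample with a weak association.
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 3]
r_without = pearson_r(x, y)

# Same sample plus one case that is extreme on both variables.
r_with = pearson_r(x + [20], y + [20])

print(round(r_without, 2))  # 0.14
print(round(r_with, 2))     # 0.97
```

With only five cases, one added outlier turns a weak association into what looks like a very strong one, so scatterplots are worth inspecting before interpreting r.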

• Construct validity

o How well were our variables measured?

o Good Reliability?

o Does each measure what it’s intended to measure (Validity)?

• External validity

o To whom can we generalize?

o To whom do we wish to generalize?

o Which population(s) did we sample from?

o What methods did we employ to sample?

o Moderating variables

• In what subgroups does the association exist?

• Goal: to learn whether the association is different within different levels of the moderator (e.g., at low SES, moderate SES, or high SES)

causality

Establishing Temporal Precedence

• Longitudinal designs: enable us to examine evidence for temporal precedence in the relation between our 2 variables of interest

o Useful for other reasons as well

o There are many variables that we cannot manipulate, or it would be unethical to do so (e.g., exposure to violent TV shows; smoking)

o Thus useful when experiments are not practical

• How to:

o Measure the same variables in the same people at two or more time points

Longitudinal designs

• Testing temporal associations between watching violent TV shows and aggression

[Cross-lagged panel diagram: Violence (3rd grade); Aggression (3rd grade); Violence (13th grade); Aggression (13th grade)]

1. Cross-sectional correlations

2. Autocorrelations

3. Cross-lagged correlations
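The three kinds of correlations in a two-wave longitudinal design can be sketched as below. The variable names and scores are hypothetical and are not taken from the study pictured on the slide; the sketch only shows which pairs of measurements each type of correlation uses.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def panel_correlations(a1, b1, a2, b2):
    """Correlations for variables A and B measured at time 1 and time 2."""
    return {
        # Same time point, different variables
        "cross_sectional": {"time1": pearson_r(a1, b1),
                            "time2": pearson_r(a2, b2)},
        # Same variable, different time points
        "autocorrelation": {"a": pearson_r(a1, a2),
                            "b": pearson_r(b1, b2)},
        # Earlier measure of one variable with later measure of the other
        "cross_lagged": {"a1_with_b2": pearson_r(a1, b2),
                         "b1_with_a2": pearson_r(b1, a2)},
    }

# Hypothetical scores for six children at two time points.
tv_violence_t1 = [1, 2, 3, 4, 5, 6]
aggression_t1 = [3, 1, 2, 5, 4, 6]
tv_violence_t2 = [2, 1, 4, 3, 6, 5]
aggression_t2 = [1, 3, 2, 5, 4, 6]

results = panel_correlations(tv_violence_t1, aggression_t1,
                             tv_violence_t2, aggression_t2)
for kind, values in results.items():
    print(kind, {name: round(r, 2) for name, r in values.items()})
```

Comparing the two cross-lagged correlations is what speaks to temporal precedence: if the earlier measure of A predicts the later measure of B more strongly than the reverse, that pattern is consistent with A coming first.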

Longitudinal designs (intensive repeated measures)

• Temporal associations between maternal physiology & harsh parenting

[Cross-lagged panel diagram: Hostile control; Physiological arousal; Hostile control (30” later); Physiological arousal (30” later)]

1. Cross-sectional correlations

2. Autocorrelations

3. Cross-lagged correlations

Longitudinal Designs

• Interrupted Time Series

• Stable Baseline Designs

o Assess baseline via multiple assessments over an extended period to establish consistent scores, then introduce the intervention/experimental condition and continue the assessments over time post-intervention

• Multiple Baseline Designs

o Introduction of intervention components is staggered across time, contexts, or situations (e.g., 3 problem behaviors in a classroom are identified; the intervention is introduced for each one in staggered fashion; all behaviors continue to be assessed)

• Reversal Designs

o Best used in situations when the intervention would not cause lasting change (i.e., to test a therapy or educational intervention)

o Some ethical concerns with ‘withdrawing’ a treatment

• Stable Baseline Designs

o Assess baseline via multiple assessments over an extended period to establish consistent scores, then introduce the intervention/experimental condition and continue the assessments over time post-intervention

[Figure: on-task behavior across assessments 1 to 7, with a baseline phase, the intervention, and a post-intervention phase]

• Multiple Baseline Designs

o Introduction of intervention components is staggered across time, contexts, or situations (e.g., 3 problem behaviors in a classroom are identified; the intervention is introduced for each one in staggered fashion; all behaviors continue to be assessed)

[Figure: baseline and intervention phases across 20 sessions for three behaviors: Poking neighbor; Grabbing objects; Not raising]

causality

3. Ruling out alternative explanations (due to extraneous 3rd variable threats…i.e., internal validity)

Bivariate correlations show covariance. _______

• But not temporal precedence: not sure which variable came first

Solution: cross-lag panel designs (longitudinal designs)

• And not internal validity: no control for third variables

Solution: multiple regression

Ruling Out Third Variables with Multiple-Regression Designs

• Measuring more than two variables

• Regression results indicate if a third variable affects the relationship

• Adding more predictors to a regression

• Regression does not establish causation

The Third Variable Problem

Multiple Regression Helps with the Third Variable Problem

Adding More Predictors

Review: Are multiple regression studies able to show causation?

– Temporal precedence? (maybe not)

– Internal validity? (You can only control for variables that you thought to measure.)

Good experiments are still the best.
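One way to see what "controlling for" a measured third variable does is the first-order partial correlation, which is closely related to what a two-predictor regression reports. This sketch uses the partial correlation formula rather than a full regression, and the data are hypothetical, constructed so that a third variable z drives both x and y.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(x, y, z):
    """Correlation between x and y with the measured variable z held constant."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical data: z influences both x and y; x and y are otherwise unrelated.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [2, 1, 4, 3, 6, 5, 8, 7]   # z plus noise
y = [2, 3, 2, 3, 6, 7, 6, 7]   # z plus different noise

print(round(pearson_r(x, y), 2))     # 0.79
print(round(partial_r(x, y, z), 2))  # -0.11
```

The strong bivariate association nearly vanishes once z is held constant. As the review slide notes, this only works for third variables that were actually measured.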

Multiple Regression and the Third Variable Problem

Multiple Regression Helps with the Third Variable Problem

Regression Does Not (Definitively) Establish Causation

Getting at Causality

Start with an association between two variables: (IV) RECESS and (DV) BEHAVIOR PROBLEMS (link C).

Mediation hypotheses propose a mechanism for a bivariate relationship. Why are these two variables correlated? (e.g., Recess affects Physical Activity, which then impacts Behavior Problems)

Mediation hypotheses are causal statements.

Mediators specify a time sequence for the three variables (temporal precedence).

Mediators also specify the mechanism (IV affects DV through the mediator).

Mediation

1. Test path c

2. Test path a

3. Test path b

4. Regression (test path c’):

DV is behavior problems

IVs are physical activity and recess

Does the ‘recess – behavior problems’ link (path c) get smaller when physical activity is controlled/accounted for? If YES, then physical activity is a mediator.
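The four steps can be sketched numerically. This is a simplified illustration, not a full mediation analysis: paths c, a, and b are taken as simple correlations, path c’ is the standardized coefficient for recess in a two-predictor regression (computed from the correlations), no significance tests are run, and all scores are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mediation_paths(iv, mediator, dv):
    """Return the associations examined in the four steps."""
    r_xy = pearson_r(iv, dv)        # step 1: path c (IV with DV)
    r_xm = pearson_r(iv, mediator)  # step 2: path a (IV with mediator)
    r_my = pearson_r(mediator, dv)  # step 3: path b (mediator with DV)
    # Step 4: standardized coefficient for the IV when the DV is regressed
    # on both the IV and the mediator (path c').
    c_prime = (r_xy - r_xm * r_my) / (1 - r_xm ** 2)
    return {"c": r_xy, "a": r_xm, "b": r_my, "c_prime": c_prime}

# Hypothetical scores for eight classrooms.
recess = [1, 2, 3, 4, 5, 6, 7, 8]
physical_activity = [2, 1, 4, 3, 6, 5, 8, 7]
behavior_problems = [9, 10, 5, 6, 5, 6, 1, 2]

paths = mediation_paths(recess, physical_activity, behavior_problems)
print({name: round(value, 2) for name, value in paths.items()})
# {'c': -0.87, 'a': 0.9, 'b': -0.95, 'c_prime': -0.08}
```

In this constructed example the recess link shrinks from about -.87 to about -.08 once physical activity is accounted for, the pattern the slide describes as evidence that physical activity is a mediator.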

Steps in Testing Mediation

Mediators Versus Third Variables

[Figure panels: Mediation Model; 3rd Variable Problem; Moderator Effect (labels: Gender, Extroversion, Group conversations)]

Indicate whether each statement below is describing a mediation hypothesis, a third variable argument, or a moderator result. First, identify the key bivariate relationship. Next decide whether the extra variable comes between the two key variables or is causing the two key variables simultaneously. Then draw a sketch of each explanation, following the examples in Figure 9.13 in the text.

1. Having a cognitively demanding job is associated with cognitive benefits in later years, because people who are highly educated take cognitively demanding jobs, and people who are highly educated have better cognitive skills.

2. Having a cognitively demanding job is associated with cognitive benefits in later years, but only among men, not among women.

3. Having a cognitively demanding job is associated with cognitive benefits in later years, because cognitive challenges build lasting connections in the brain.

1. Viewing violent television is associated with aggressive behavior because children model what they see on TV.

2. Viewing violent television is associated with aggressive behavior because people who watch more violent TV have more lenient parents, and these lenient parents also do not care if their children are violent.

3. Viewing violent television is associated with aggressive behavior very strongly among teenagers, but less strongly among young adults.

In Class Activity #4

Psychology

Class 8


Analyses Design selection

causality

Testing “Causal” Hypotheses

• Review basic components of ‘Experiments’

o Independent variables

• Manipulated

o Dependent variables

• Measured

Three conditions of causality…

1. Establishing covariation

2. Establishing temporal precedence

3. Establishing internal validity

Two kinds of designs…

Two kinds of designs that support causal claims

1. Independent-groups designs

o (i.e., between-groups or between-persons or BP designs)

o Different groups of participants are assigned to different levels of the independent variable

2. Within-groups designs

o (i.e., within-persons or WP designs)

o One group of participants is assigned to (or presented with) all levels of the independent variable

• “Enables researcher to treat each participant as his/her own control”

o Two basic forms of this design

1. Posttest only designs: random assignment and 1 posttest

R X O2a

2. Pretest-posttest designs: random assignment & key DVs are measured twice, once before and once after exposure to the IV

R O1 X O2

R O1 O2

Test for covariation by detecting differences in the dependent variable; establish temporal precedence because the IV precedes changes in the DV; if the study is conducted well (no design confounds, no selection effects), internal validity is established.

[Two design diagrams, each showing: Randomly assign; IV: group 1 / IV: group 2; Measure of DV]

All of the above applies, plus…

Use a pre-posttest design to evaluate whether random assignment made the groups equal (relevant with small-n studies); can better track change over time in each group


EXAMPLE

Study testing the effects of two kinds of praise on children’s problem-solving effort:

Process praise: ‘you must have worked hard at these problems’

Person praise: ‘you must be smart at these problems’

[Diagram: Randomly assign; IV: person praise / IV: process praise; # problems solved]

[Diagram: Randomly assign; # problems solved; IV: person praise / IV: process praise; # problems solved]

[Graphs: # problems solved for Process vs. Person praise, shown for posttest only designs and for pretest-posttest designs (Trial 1 (pre), Trial 2 (post))]

o Which Design is Better…?

o It depends…..

o Posttest only design

• combines random assignment with a manipulated IV, enabling powerful causal conclusions

o Pretest-posttest design

• Adds a pre-testing step…helps if you want to be sure that IV levels are equivalent at pretesting (as long as the pretest doesn’t change behavior…), and helps to more clearly map patterns of change

o Concurrent-measures design

• Participants are exposed to all levels of an IV at roughly the same time, and a single DV measure is taken

o e.g., Harlow’s study of attachment in baby monkeys

• Two ‘mothers’ are presented

IV: (mother type) a wire mother w/milk vs. a cloth mother w/no milk

DV: preference as measured by time spent clinging to either

o e.g., Coke v. Pepsi taste test

[Diagram: One group; Wire mom w/milk and Cloth mom; Clinging behavior]

o Repeated-measures design

• Participants are measured on a DV more than once, after exposure to each level of the IV

o e.g., Bick & Dozier’s (2008) study of social bonding in new mothers

• Two ‘toddlers’ are presented and mothers instructed to interact closely with them

IV: (toddler type) own toddler vs. different toddler

DV: Oxytocin levels in bloodstream (social neuropeptide central to human bonding)

[Diagram: One group; Interact w/own toddler; Measure oxytocin; Interact w/different toddler; Measure oxytocin]

o Advantages of Within-groups designs

• Ensures participants in (or exposed to) all levels of the IV are equivalent. Why ____________________________?

• Gives the research study more (statistical) power to see differences across conditions if they exist. Why ___________________? As per MAXMINCON, when extraneous differences in demographic and other personality variables, etc. are held constant across all levels, we can more easily detect an effect of the IV manipulation if there is one.

• These designs require fewer participants overall

Within-groups designs (i.e., within-persons or WP designs)

• Do within-group designs allow you to make causal claims?

• Covariation_____?

• Temporal precedence____________?

• Threats to internal validity_______________?

• Potential threat to internal validity for WP designs = if being exposed to one condition changes how someone reacts to the other condition(s)

o Called: order effects or practice effects or carryover effects

• Solution?

o Counter-balancing controls for order effects
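Counterbalancing can be sketched as follows: list every possible order of the conditions, then randomly assign participants to orders so that each order is used equally often (full counterbalancing). The participant IDs are hypothetical; the two condition names follow the toddler example above.

```python
import itertools
import random

def counterbalance(participants, conditions, rng):
    """Assign each participant one order of the conditions, using every
    possible order about equally often."""
    orders = list(itertools.permutations(conditions))
    shuffled = list(participants)
    rng.shuffle(shuffled)  # random assignment of people to orders
    return {person: orders[i % len(orders)]
            for i, person in enumerate(shuffled)}

rng = random.Random(3)  # fixed seed so the sketch is reproducible
mothers = [f"mother_{i}" for i in range(10)]
schedule = counterbalance(mothers, ["own toddler", "different toddler"], rng)

for person in sorted(schedule)[:3]:
    print(person, "->", schedule[person])
```

Because half of the participants experience each order, any order, practice, or carryover effect is spread evenly across conditions instead of being confounded with the IV. With many conditions the number of possible orders grows quickly, and researchers often use only a subset of orders.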

[Counterbalancing diagram: Randomly assign; Interact w/own toddler; Interact w/different toddler; Measure oxytocin after each interaction]

Testing Causal Hypotheses

• Designing and evaluating studies that make a causal claim (via the 4 big validities)

• Construct validity

o How well were the variables measured and manipulated?

• External validity

o To whom or to what can you generalize the causal claim?

• To other people…?

• To other situations…?

• Statistical conclusion validity

o How well do your data support your causal conclusion?

1. Is the difference statistically significant?

2. How large is the effect?

• Internal validity

o Are there (plausible) alternative explanations for the outcome?

Testing Causal Hypotheses

• Designing and evaluating studies that make a causal claim (via the 4 big validities)

• Internal validity

o Are there (plausible) alternative explanations for the outcome?

o Three fundamental questions worth asking…

1. Did the design of the experiment ensure there were no design confounds? Or did some other variable accidentally covary along with the intended independent variable?

2. If the experimenters used an independent-groups design, did they control for selection effects by using random assignment or matching?

3. If the experimenters used a within-groups design, did they control for order effects by counterbalancing?

Threats to Internal Validity that can apply to an experiment

• Many threats to the validity of studies can be corrected for simply by adding a comparison group.

• A few threats may apply to any intervention study/experiment

1. Observer bias

• Possible in any study with behavioral/observed DVs

o Occurs when researchers’ expectations influence their ratings/scores/interpretation of the results

• Threatens internal validity (an alternative explanation now exists…) and construct validity (ratings/scores don’t represent ‘true’ scores)

• Solution: ensure staff who measure the DV are blind to study hypotheses

2. Demand characteristics

• A problem when participants guess what the study is supposed to be about & change their behavior in the expected direction

• Solution: conduct a double-blind study, where neither staff nor participants know which condition they are in; at minimum, ensure staff are blind to condition

3. Placebo effects

• Occur when participants improve after treatment, but only because they believe they received an effective intervention

In-class activity #5: Article review Prinz et al., 2009

• Research question o Specific hypotheses

o IV = __________________; # levels of the IV = _______________

o Levels of the IV are:

o DVs: # of DVs = ______; Specific DVs are:_______________, ___________, and ____________________

o Design: ___________________________

• Diagram the design

• Describe the random assignment process. Who/what was randomized?

• Who were the participants?

• Describe the Triple P intervention condition

• Were the hypotheses supported?

• Did they acknowledge plausible threats to validity? What are some examples…?

Beth Stormshak, Ph.D.

Professor, College of Education

University of Oregon

An intervention is one thing

Implementation is something else altogether

Implementation Science

According to NIH (2008):

The use of strategies to adopt and integrate evidence-based health interventions and change practice patterns within and across specific systems

Action Oriented

Within Settings or Systems

AND collects data

Chambers DA. Advancing the science of implementation: A workshop summary. Administration and Policy in Mental Health and Mental Health Services Research. 2008;35(1-2):3-10.

1. We know a lot about what works

10K reviewed studies in What Works Clearinghouse

2. We are short on implementation action strategies to put what works into practice:

3. It takes too long for research to affect practice

T1 – Type 1 – The application of basic research findings to the development of interventions

T2 – Type 2 – Investigates the process and mechanism through which tested and proven interventions are integrated into practice and policy

T1 research is more common, T2 research is more limited

The use of effective interventions without implementation strategies is like serum without a syringe; the cure is available, but the delivery system is not

Fixsen, Blase, Duda, Naoom, & Van Dyke, 2010.

Only a small percentage of interventions implemented by community based delivery systems are evidence based.

Longitudinal Studies of a Variety of Comprehensive School Reforms

Effective Interventions | Actual Supports (Years 1-3) | Outcomes (Years 4-5)
Every Teacher Trained | Fewer than 50% of the teachers received some training | Fewer than 10% of the schools used the CSR as intended
Every Teacher Continually Supported | Fewer than 25% of those teachers received support | Vast majority of students did not benefit

Aladjem & Borman, 2006; Vernez, Karam, Mariano, & DeMartini, 2006

“17 Year Gap” in Health Care

Is the gap between research and practice similar in education to that existing in health?

Types of Gaps?

As long as?

As important to shorten? Which way?

As resistant to change?

Traditional Translation Pipeline

[Figure labels: Preintervention; Efficacy Studies; Effectiveness Studies; Implementation (Exploration; Adoption / Preparation; Implementation; Sustainment); Could a Program Work?; Does a Program Work?; Making a Program; Generalized knowledge; Intervention. Sources shown: IOM 2009; Landsverk, Brown et al.; Aarons et al.]

Intervention: Program, Practice, Policy, Principles

Practice Setting: Delivery Support System

Ecological System: Population and Community/Cultural Context

Preadoption

◦ How do preferences for EBI impact consumer choices?

◦ What are the key channels for stakeholders to obtain EBI information?

Adoption

◦ What are key market, organizational, and other factors influencing adoption decisions?

◦ What evidence is used by decision makers in the adoption phase?

Implementation

◦ What are the most effective delivery systems for different settings?

◦ What influences consumer participation?

◦ What are the factors that impact implementation quality?

Sustainability

◦ What funding models are needed to sustain the program?

◦ What are the effective leadership strategies for long-term implementation?

The Baltimore City Public School System (BCPSS) has collaborated in 3 generations of education and prevention field trials.

Trials were directed at helping children master the rules of behaving, attending, academic learning, and socializing appropriately in the 1st grade classroom.

Interventions were tested separately in the 1st generation (our focus today), then together in later trials.

Levels of Prevention and Treatment

[Figure levels: Universal; Selective; Indicated; Rx (Med, MH, Soc Welfare)]

Early Risk in Prevention Research

Over the last four decades much has been learned about early risk factors and paths leading to drug abuse, and other behavioral, mental health, and school problems.

Aggressive, disruptive behavior as early as 1st grade has been repeatedly found to be a risk factor for later drug and alcohol abuse and disorders, delinquency, violence, tobacco use, high risk sex, school failure and other high risk behaviors.

Parenting interventions are among the most effective for reducing aggressive behavior over time.

You have decided to implement your intervention in schools

What are the barriers?

What are the strengths?

How will you go about doing this?

Do you think you will be successful?

[Figure labels: Developmental & Measurement Models; Intervention Design & Experiment; Test and Tailor for Real World Conditions; Revise for Public Health Service Settings; Improved: Effectiveness, Efficiency, Expense; Ethics]

An Overview of the Family Check-Up and Follow-Up Services

[Figure: The Family Check-Up: Initial Interview; Assess Child & Family; Parent Feedback & Planning. Follow-up services: Brief, tailored PMT; PMT Treatment; Child CBT; Community Treatment Resources]

Mindful Parenting (proactive, Monitoring)

Positive Behavior Support

Setting Healthy Limits

Family Relationship Building

Project Alliance 1: Portland Public Schools, 1995-present

Project Alliance 2: Portland Public Schools, 2005-2010

Early Steps: Children involved in WIC, ages 2-10

Shadow Project: American Indian families in PNW

Community Mental Health (CDC): CMH agencies in Portland, 120 families

Positive Family Support: 44 Oregon middle schools

Positive Family Support: Elementary school: 5 Oregon elementary schools

Service Systems Affecting Mental Health of Children and Adolescents

[Figure labels: Developmental; Childhood; Adolescence; Public School Setting; Community Programs: Treatment and Rehabilitation; Preschools]

Effects of the Early Childhood Family Check-up: Average 2 Annual Sessions, 70% Engagement

Outcome Domain | Intervention Effects | Period of Development | Authors
Behavioral | Problem behavior | Age 2 to 4 | Shaw et al., 2006
Behavioral | Problem behavior | Age 2 to 7.5 | Dishion et al., 2013
Affective | Co-morbid depression | Age 2 to 4 | Connell et al., 2009
Affective | Maternal depression | Age 2 to 4 | Shaw et al., 2009
Parenting | Observed PBS | Ages 2 to 3 | Dishion et al., 2008
Parenting | Reduced coercion | Ages 2 to 4 | Smith et al., 2013
Cognitive/Educational | Improved effortful control and language; School readiness | Ages 2 to 7 | Chang et al., in press; Brennan et al., 2013

Effects of the School-based Family Check-up: Average 6 Sessions over 2 years and 25-50% Engagement

Outcome Domain | Intervention Effects | Period of Development | Authors
Behavioral | Antisocial behavior | Age 11 to 19 | Van Ryzin et al., 2012
Behavioral | Early drug use | Age 11 to 14 | Dishion et al., 2002
Behavioral | Drug (ab)use | Age 11 to 23 | Veronneau et al., in press
Behavioral | Problem behavior | Age 11 to 14 | Stormshak et al., 2010
Behavioral | High risk sex | Age 11 to 22 | Caruthers et al., 2013
Affective | Depression | Age 11 to 15 | Connell et al., 2006
Affective | Depression | Age 11 to 14 | Fosco et al., in press
Parenting | Observed monitoring | Ages 11 to 14 | Dishion et al., 2003
Parenting | Reduced conflict | Ages 11 to 16 | Van Ryzin et al., 2012
Cognitive/Educational | Improved grades and attendance | Ages 11 to 17 | Stormshak et al., 2010

Phase 1: Exploration and Readiness

1) Information/brochure, cost structure

2) Assessment process and review

3) Plan and scope

Phase 2: Installation

1) Role definition

2) Priority and staging

3) Work site training

4) Technology transfer

5) Supervision training

Phase 3: Implementation consultation

1) Ongoing COACH supervision

2) Feedback monitoring

3) Clinical outcome monitoring

Phase 4: Sustainability

1) Certification of therapists

2) Certification of supervisors

3) Certification of agency

4) Plan for fidelity monitoring

Funding for this research supported by the Department of Education IES, grant R324A090111

Awarded to John Seeley, Ph.D., Tom Dishion, Ph.D., Beth Stormshak, Ph.D., & Keith Smolkowski, Ph.D.

Increased problem behavior

Increased peer group influence

Decreased attendance

Decreased parent involvement

Decreased academic performance

Robust evidence links parenting practices and family engagement in school to positive outcomes for adolescents and young adults (Biglan et al., 2004; Dishion et al., 1996, 2002; Fosco et al., 2013; Henderson & Berla, 1994; Henderson & Mapp, 2002)

According to the public health perspective:

◦ Effective interventions should reach large numbers of people (Biglan, 1995; Biglan, Sprague, & Moore, 2006)

◦ Interventions should be designed to fit in or alter existing service-delivery systems (Hoagwood & Koretz, 1996)

Schools are the largest, and often only, providers of child behavioral health services for many communities (Burns et al., 1995; Hoagwood et al., 2001, 2003)

A school-based system to form effective partnerships with parents to support student success

What it is: Strengths-based program

Integrated into PBIS tiers

Focused on family-school partnerships

Proactive

Inform, Invite, Involve parents in response to student needs

Foundation in empirically-supported strategies

Indicated

Selected

Universal

• Family Check-Up

• Parenting Support Sessions

• Parent Management Training

• Community Referrals

• Parent Integration CICO

• Attendance & Homework Support

• Home-School Beh Change Plans

• Email and Text messages

• Family Resource Center

• Parenting Materials (Brochures/Videos/Handouts)

• Positive Family Outreach

• Proactive Parent Screening

• Individualized Supports

• Functional Behavioral Assessments

• Specialized Supports

• Check-In/Check-Out

• School Rules & Expectations

• Positive Reinforcement

• Student Needs Screening

Assist middle school staff as they implement Positive Family Support within their existing Positive Behavioral Interventions and Supports infrastructure.

Brochures, TV/DVD, Supplies Meeting Table, Computer, Coffee/Danishes on counter

Invite Parents to Join CI/CO

Use Home Incentives Plan

Check-In/ Check-Out

For teachers & family resource specialists

For parents and students (with teacher & family resource specialist help)

For teachers and parents

Parent Readiness Screener (school entry)

Teacher & Readiness Screener (fall-spring)

Family Check Up

School-Parent PBS plan

Tailored Student & Family Support

Tier I Family Support: Parent Student Readiness Screener

A unidimensional, psychometrically sound parent screener

Linked with proximal attributes of student functioning (e.g., completes homework and assignments on-time, shows up on-time to school)

Moore et al. (2014)

Initial Interview

Assess Child & Family

Parent Feedback & Planning

Brief, tailored PMT

PMT Treatment

Child CBT

Community Treatment Resources

An Overview of the Family Check-Up and Follow-Up Services

The Family Check-Up

Tier III Family Support: The Family Check-up

Dishion & Stormshak (2007); Dishion, Stormshak, & Kavanagh (2012)

Recruitment

◦ All middle schools in Oregon implementing PBIS invited to participate

Strict adherence to PBIS later revised due to recruitment difficulties

◦ Interested schools provided with personal visit to explain project and implementation process

◦ Schools randomly assigned to intervention or wait-list control (N=41)
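The slides do not describe how the random assignment was carried out. As a minimal sketch only, school-level (cluster) randomization of 41 schools into two arms could look like the following. The school IDs, the seed, and the rule that the odd school goes to the intervention arm are all assumptions, chosen so the split matches the group sizes reported in Table 1 (21 intervention, 20 control).

```python
import random


def assign_schools(school_ids, seed=2014):
    """Randomly split schools into intervention and wait-list control arms.

    Hypothetical helper: the project's actual randomization procedure is
    not given in the slides. The seed only makes the split reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(school_ids)
    rng.shuffle(shuffled)
    # Assumption: with an odd number of schools, the extra school goes to
    # the intervention arm (41 schools -> 21 intervention, 20 wait-list).
    n_intervention = (len(shuffled) + 1) // 2
    return {
        "intervention": sorted(shuffled[:n_intervention]),
        "wait_list": sorted(shuffled[n_intervention:]),
    }


if __name__ == "__main__":
    # Placeholder IDs 1..41 stand in for the participating middle schools.
    groups = assign_schools(range(1, 42))
    print(len(groups["intervention"]), "intervention schools")
    print(len(groups["wait_list"]), "wait-list control schools")
```

Because the school, not the student, is the unit of assignment, any later analysis of student outcomes would need to account for clustering within schools.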

Workshops

◦ Spring before implementation: All staff introduction to PFS to increase school-wide awareness and buy-in

◦ Summer before implementation: 2-day training for core PFS staff to familiarize with goals and develop learning community

Had to be revised due to drastic budget cuts throughout implementation

◦ Fall of implementation: All staff training to increase positive communication with parents

Consultation

◦ Intervention schools provided two years of consultation

Planned visits and requested assistance

◦ Consisted of:

Modeling positive family interventions

Problem solving regarding when and how to involve families in intervention

Integration of family involvement into existing school interventions

Setting up family resource center

Provision of parenting resources (brochures, videos, books, etc.)

Increasing positive and proactive family outreach

Family-School Wide Evaluation Tool (FamSET)

Multi-method, multi-source assessment completed by trained assessor with appropriate middle school staff member

Maintains alignment with the School-Wide Evaluation Tool (SET; Horner et al., 2004)

Example items

◦ “Are parents contacted before a child’s behavior gets out of hand?” (1 = never, 4 = always)

◦ “At this school, do you offer family-based services or educational material?” (1 = never, 4 = always)

[Bar chart of FamSET items, horizontal axis from 0.0 to 3.0; each item is tagged by tier (U, S, or I). Adapted from Brown et al., 2013]

School budget contained an allocated amount of money for school-wide behavioral support (U)

Followed-up with parents about previously discussed concerns (I)

Worked directly with parents to support positive parenting practices (I)

Asked parents to participate in positive reward systems for targeted school behaviors (S)

Parents had input into school-wide policies regarding student discipline practices (U)

Offered family-based services or educational material (U)

Worked directly with parents to support family involvement in academic issues (S)

Provided assessment-based feedback about parenting related to academics (S)

Defined system for regular, positive contact with families (U)

Parents contacted before a child's behavior got out of hand (U)

Number of resources available to families at school (U)

Provided questionnaire to assess parents' perspectives on student strengths and risk factors (U)

80% of schools with the highest FamSET scores were in the intervention condition

60% of schools with the lowest FamSET scores were in the control condition

Implementation by level: Poor Implementation, Adequate Implementation, Strong Implementation

Universal Level: 4.8%, 28.6%, 66.7%

Selected Level: 4.8%, 42.9%, 52.4%

Indicated Level: 23.8%, 71.4%, 4.8%

Intervention School Universal Selected Indicated Overall

Mad. 8 8 7 7.67

CP 9 9 5 7.67

Bro. 9 8 5 7.33

Cof. 9 7 6 7.33

HD 9 7 6 7.33

Ro. 9 8 5 7.33

BC 9 7 5 7.00

RR 8 8 5 7.00

Dam. 8 6 5 6.33

AS 8 8 2 6.00

Aza. 9 5 4 6.00

CR 8 5 5 6.00

WM 7 6 5 6.00

Sha. 7 8 2 5.67

WM 6 7 4 5.67

Ast. 6 4 6 5.33

Cre. 4 6 5 5.00

Tal. 6 5 4 5.00

DC 6 5 2 4.33

Lin. 5 4 1 3.33

Pio. 2 2 2 2.00
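The arithmetic behind this table can be checked with a short sketch. The Overall column is consistent with the unweighted mean of the three tier scores, and the Poor/Adequate/Strong percentages reported for the 21 intervention schools are reproduced if scores of 0 to 3 are treated as Poor, 4 to 6 as Adequate, and 7 or higher as Strong. Those cut points are an assumption inferred from the numbers, not something stated on the slides; the function names are illustrative.

```python
from collections import Counter

# (school, universal, selected, indicated) copied from the table above.
# "WM" appears twice in the source, apparently for two different schools.
SCORES = [
    ("Mad.", 8, 8, 7), ("CP", 9, 9, 5), ("Bro.", 9, 8, 5), ("Cof.", 9, 7, 6),
    ("HD", 9, 7, 6), ("Ro.", 9, 8, 5), ("BC", 9, 7, 5), ("RR", 8, 8, 5),
    ("Dam.", 8, 6, 5), ("AS", 8, 8, 2), ("Aza.", 9, 5, 4), ("CR", 8, 5, 5),
    ("WM", 7, 6, 5), ("Sha.", 7, 8, 2), ("WM", 6, 7, 4), ("Ast.", 6, 4, 6),
    ("Cre.", 4, 6, 5), ("Tal.", 6, 5, 4), ("DC", 6, 5, 2), ("Lin.", 5, 4, 1),
    ("Pio.", 2, 2, 2),
]


def overall(universal, selected, indicated):
    """Unweighted mean of the three tier scores, rounded as in the table."""
    return round((universal + selected + indicated) / 3, 2)


def level(score):
    """Classify one tier score. Cut points are assumed, inferred from the data."""
    if score <= 3:
        return "Poor"
    if score <= 6:
        return "Adequate"
    return "Strong"


def level_percentages(tier_index):
    """Percent of schools at each level; tier_index 1, 2, 3 = U, S, I."""
    counts = Counter(level(row[tier_index]) for row in SCORES)
    n = len(SCORES)
    return {
        name: round(100 * counts[name] / n, 1)
        for name in ("Poor", "Adequate", "Strong")
    }


if __name__ == "__main__":
    for name, u, s, i in SCORES:
        print(name, overall(u, s, i))
    for label, idx in (("Universal", 1), ("Selected", 2), ("Indicated", 3)):
        print(label, level_percentages(idx))
```

Under these assumed cut points, the pattern in the slides is easy to see: most schools reach strong implementation at the Universal tier, while only one does at the Indicated tier.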

Conditions During Implementation: National

School Year / Operating Expenditure per Student / Capital Expenditure per Student

2008-2009 $9392 $1364

2009-2010 $9275 $990

2010-2011 $9363 $777

2011-2012 $9366 $763

2012-2013 $9364 $556

From Oregon Department of Education, 2008-2013

Principal SST SPED Counselor

Highest FamSET Scores 20% 31.6% 26.7% 8.3%

Lowest FamSET Scores 60% 66.7% 73.8% 66.7%

Note. Percent turnover from year 1 to year 2, n=10

Table 1

FamSET Implementation Findings

Columns: Control Schools (n = 20) at Time 1 and Time 2/3a; Intervention Schools (n = 21) at Time 1 and Time 2/3a; effect size d. Scale rows give mean (SD); item rows give %b.

Universal Implementation (range = 0 – 22): Control 10.65 (4.95), 14.25 (3.58); Intervention 10.86 (4.36), 18.86 (2.35); d = 1.58

Does your school have a room dedicated to parent or family services? Control 30%, 45%; Intervention 23.8%, 85.7%; d = 1.85

Did your school offer parent topic nights? Control 35%, 50%; Intervention 38.1%, 85.7%; d = 1.22

Selected and Indicated Implementation (range = 0 – 22): Control 16.15 (3.76), 18.90 (1.67); Intervention 15.38 (3.83), 19.71 (1.77); d = 0.47

Offer family-based assessments for students struggling academically or behaviorally? Control 45%, 40%; Intervention 33%, 76.2%; d = 1.51

Is there consistent follow-through on family support services discussed in team meetings? Control 90%, 95%; Intervention 71.4%, 95.2%; d = 0.40

Number of Resources Available to Families (range = 0 – 11): Control 1.30 (1.95), 3.67 (4.56); Intervention 1.48 (2.99), 7.48 (3.69); d = 0.96

Is there a family support person identified at the school? Control 25%, 35%; Intervention 19%, 71.4%; d = 1.28

a Third assessment for Wave A and B schools; second assessment for Wave C schools

b Item-level data indicate the percent of schools implementing each intervention component

School readiness assessment is an important component of the pre-implementation process

Implementation models rarely address the increased response cost to school staff of changing routines and expectations

◦ More attention needed regarding how to reinforce school staff for implementation efforts

Interventions more likely to be sustained when implemented at the state or district level and supported with internal funds

◦ High staff turnover often prohibits embedding interventions at the individual school level

Intervention implementation most effective when scaffolded and supported over a number of years

◦ Funding for maintenance of implementation critical
