Empirical Research
The following content is licensed under a Creative Commons Attribution 4.0 International license (CC BY-SA 4.0)
Valentin Schwind
Image Source: https://pxhere.com/de/photo/544817
Learning Goals
▪ Understanding the purpose of empirical research
▪ Understanding the purpose of experiments in HCI
▪ Learning experimental designs
Why Empirical?
▪ To understand cause and effect
▪ “When metal is heated it expands.”
▪ “As the moon has gravitational pull, the oceans have tides.”
▪ “When the price increases, the sales go down.”
▪ “When users type on my new keyboard, their typing speed increases.”
▪ “My algorithm increases the memorability of its users.”
E. (2001). GROUPS : INTERACTION AND PERFORMANCE.
Why Empirical?
▪ To understand cause and effect
▪ “When metal is heated it expands”
▪ To make predictions
▪ “The metal in this bridge needs space to expand in hot weather”
▪ To test hypotheses
▪ “The metal of the bridge withstands extreme weather changes”
▪ To derive models
▪ $L(T) = L(T_0)\,\exp\left(\int_{T_0}^{T} \alpha(T)\,dT\right)$
E. (2001). GROUPS : INTERACTION AND PERFORMANCE.
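To illustrate how a derived model supports predictions such as the bridge example, the following minimal sketch evaluates the expansion model numerically. It assumes the model reads as L(T) = L(T0) · exp(∫ α(T) dT), integrated from T0 to T. The function name `expanded_length`, the bridge dimensions, and the expansion coefficient of 12e-6 per kelvin are illustrative assumptions, not values from the slides.

```python
import math

def expanded_length(l0, t0, t1, alpha, steps=1000):
    """Length at temperature t1, given length l0 at t0 and a function alpha(T)."""
    # Trapezoidal rule for the integral of alpha(T) from t0 to t1.
    h = (t1 - t0) / steps
    integral = 0.5 * (alpha(t0) + alpha(t1))
    for i in range(1, steps):
        integral += alpha(t0 + i * h)
    integral *= h
    return l0 * math.exp(integral)

# Hypothetical bridge: 100 m of steel at 0 °C, warmed to 40 °C.
ALPHA_STEEL = 12e-6  # per kelvin; assumed constant, illustrative value only
growth = expanded_length(100.0, 0.0, 40.0, lambda t: ALPHA_STEEL) - 100.0
print(f"extra length: {growth * 100:.1f} cm")
```

Under these assumptions the bridge needs a few centimetres of room to expand. This is the kind of prediction that can then be tested as a hypothesis.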
Causation versus Correlation
Example: Storks and birthrate
Matthews, R. (2000), Storks Deliver Babies (p= 0.008). Teaching Statistics, 22: 36-38. doi:10.1111/1467-9639.00013
Causation versus Correlation
▪ Fact: Birthrate and number of storks correlate
▪ Question: “If I want more babies can I move to an area with many storks?”
▪ Depends on the cause!
▪ more storks → more children: “Yes, do it!”
▪ more children → more storks: “No.”
▪ “Tertium quid” (a third factor) → more children and more storks: “Depends.”
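The “tertium quid” case can be sketched with simulated data: a hidden third factor drives both stork counts and birth counts, so the two correlate strongly without either causing the other. All numbers below are made up for illustration (the variable `area`, the coefficients, and the noise levels are assumptions), and the helper `pearson` is a plain textbook implementation.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(42)  # fixed seed so the sketch is reproducible
area = [rng.uniform(1, 100) for _ in range(500)]   # the hidden third factor
storks = [2 * a + rng.gauss(0, 10) for a in area]  # depends on area only
births = [5 * a + rng.gauss(0, 25) for a in area]  # depends on area only

r_raw = pearson(storks, births)

# Remove the (here known, because simulated) influence of the third factor
# and correlate what is left over.
storks_rest = [s - 2 * a for s, a in zip(storks, area)]
births_rest = [b - 5 * a for b, a in zip(births, area)]
r_controlled = pearson(storks_rest, births_rest)

print(f"storks vs. births: r = {r_raw:.2f}")
print(f"after removing the third factor: r = {r_controlled:.2f}")
```

In this simulation the raw correlation is high, while the correlation that remains after accounting for the third factor is close to zero. Moving to an area with many storks would therefore not help.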
Hypothesis: “My new keyboard is easy to use”
Photo by Niels Henze
Controlled Experiments
▪ Participants rated the system easy to use, because
▪ they actually find the system easy to use?
▪ they want to support you in your research?
▪ they were overwhelmed by the system’s novelty?
▪ their football team won the world cup yesterday?
▪ Only by determining the precise cause of our observation can we make predictions about the world
▪ But a mere observation will not help to find the answer!
Controlled Experiments
▪ Controlled experiments are (probably the only reliable) means to isolate cause and effect
▪ What if there are potentially two effects, or if they depend on each other?
▪ Can we consider multiple causes and observe multiple effects?
Cause → Experiment → Effect
Cause 1, Cause 2 → Experiment → Effect 1, Effect 2
Experimental Designs
▪ In controlled experiments, it is possible to analyze multiple factors at the same time
→ such designs are called: multifactorial designs
▪ Single or multifactorial designs: each factor must have at least two characteristics
→ such characteristics are called: levels
→ combinations of levels are called: conditions
▪ Levels can have different types:
▪ present (yes/no)
▪ categorical (dogs, cats, …)
▪ continuous (volume, length, age, …)
Experimental Variables
▪ Fixed Factors (x) → Independent variables
▪ “what we control” (e.g. the prototype)
▪ Levels are fixed and represent the experimental interest
▪ Measures (y) → Dependent variables
▪ “what we observe” (e.g. task completion time)
▪ The measure of experimental interest
▪ Control Variables (ε) → Covariates
▪ “what we know but don't control” (e.g. handedness)
▪ Random Factors (έ) → Error variable
▪ No explicit factor between the levels (e.g. the participant)
Image from https://pxhere.com/en/photo/544817
Image from https://pxhere.com/en/photo/1001039
The Independent Variable (IV)
▪ How to manipulate one single aspect?
▪ In theory:
▪ by keeping all other factors (environment, weather, intelligence, mood, training,…) stable
▪ but people, situations, training, fatigue, etc… are never identical after performing the first level of a condition!
▪ In practice:
▪ a random sample
▪ (pseudo) randomization of conditions
▪ permutations
▪ counter-balancing using e.g. a (balanced) Latin-Square
Latin Square
▪ a Latin square is an n × n array filled with n different symbols, each occurring exactly once in each row and exactly once in each column
▪ a balanced Latin square additionally ensures that each symbol immediately follows every other symbol exactly once
Image by Schultz (2006) from wikimedia.org (CC-BY-SA-2.5) https://commons.wikimedia.org/wiki/File:Fisher-stainedglass-gonville-caius.jpg
A B D C
B C A D
C D B A
D A C B
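A square like the one above can be generated and checked programmatically. The sketch below uses a common construction for an even number of conditions (first row 0, 1, n-1, 2, n-2, …, every further row shifted by one). The function names are hypothetical, and the construction is one of several possible ones, not necessarily the one used to produce the slide.

```python
def balanced_latin_square(n):
    """Return an n x n balanced Latin square as lists of ints (even n only)."""
    if n < 2 or n % 2 != 0:
        raise ValueError("this construction needs an even n >= 2")
    # First row follows the pattern 0, 1, n-1, 2, n-2, ...
    first, low, high = [0], 1, n - 1
    for position in range(1, n):
        if position % 2 == 1:
            first.append(low)
            low += 1
        else:
            first.append(high)
            high -= 1
    # Every further row shifts all symbols of the first row by one.
    return [[(symbol + row) % n for symbol in first] for row in range(n)]

def is_latin(square):
    """Each symbol occurs exactly once per row and once per column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({row[c] for row in square} == symbols for c in range(n))
    return rows_ok and cols_ok

def is_balanced(square):
    """No ordered pair of neighbouring symbols occurs more than once."""
    pairs = [(row[i], row[i + 1]) for row in square for i in range(len(row) - 1)]
    return len(pairs) == len(set(pairs))

square = balanced_latin_square(4)
for row in square:
    print(" ".join("ABCD"[s] for s in row))
```

For n = 4 this prints the same four rows as the square shown above. Each row is the condition order for one participant.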
Counter-Balancing vs. Randomization
▪ Conditions in a random order can avoid sequence effects, e.g. through training or tiredness
→ Randomness does not necessarily even out sequence effects
▪ Conditions in a balanced Latin-square design even out the “what-follows-what” scenario and protect the experiment against order effects
▪ But a balanced Latin-square design requires the number of participants to be a multiple of the number of conditions
→ e.g. 50 conditions: 50, 100, 150, 200… participants
▪ Not always feasible (e.g. in online surveys with an unpredictable number of participants)
→ In those cases experimental designs are (pseudo-)randomized by a computer
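The trade-off can be sketched as follows: participants cycle through the rows of a Latin square when their number is known, and receive an independently shuffled order otherwise. The helper names and the seed are hypothetical; `SQUARE` is the 4 × 4 square from the Latin Square slide.

```python
import random

SQUARE = [
    ["A", "B", "D", "C"],
    ["B", "C", "A", "D"],
    ["C", "D", "B", "A"],
    ["D", "A", "C", "B"],
]

def latin_square_order(participant_index, square):
    """Pick the row of a Latin square for a participant, cycling through rows."""
    return square[participant_index % len(square)]

def is_fully_counterbalanced(n_participants, n_conditions):
    """Complete counter-balancing needs a multiple of the number of conditions."""
    return n_participants > 0 and n_participants % n_conditions == 0

def random_order(conditions, rng):
    """Fallback: an independently shuffled order for one participant."""
    order = list(conditions)
    rng.shuffle(order)
    return order

rng = random.Random(7)  # fixed seed only to make the sketch reproducible
print(latin_square_order(5, SQUARE))       # sixth participant gets the second row
print(is_fully_counterbalanced(10, 4))     # 10 is not a multiple of 4
print(random_order(["A", "B", "C", "D"], rng))
```

With ten participants and four conditions, two rows of the square would be used more often than the others, which is why an open-ended online study typically falls back to computer-generated random orders.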
Within-Subjects IVs
▪ Participants are assigned to all conditions
▪ Advantages
▪ Economy
▪ Sensitivity
▪ Cancelling out individual differences
▪ Disadvantages
▪ Carry-over effects from previous conditions
▪ Conditions must be balanced
Repeated measures designs
Between-Groups/Between-Subjects IVs
▪ Participants are assigned to one condition only
▪ Advantages
▪ Simplicity
▪ Less chance of practice or fatigue effects
▪ Useful when it is impossible for an individual to participate in all conditions (e.g. gender)
▪ Disadvantages
▪ Expense (time, effort, and number of participants)
▪ Insensitivity to experimental manipulations
Independent measures designs
Hybrid IVs and Mixed-Designs
▪ Two types of variables:
▪ between-subjects variable(s)
▪ within-subjects variable(s)
▪ Participants are randomly assigned to one level of each between-subjects variable
▪ Randomized assignment
▪ All participants are exposed to each level of the within-subjects variable(s)
▪ Randomized or counter-balanced order
Multifactorial Designs
Full Factorial Design
▪ Independent variables: Prototype and User Interface
▪ Levels: A, B, C and 1, 2, 3
▪ Conditions = 9
▪ Conditions: A1, A2, A3, B1, B2, B3, C1, C2, C3
Nested Design
▪ Conditions = 6
▪ Conditions: A1, A2, B1, B2, B3, C2, C3
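The conditions of the full factorial design (A1 … C3) are the Cartesian product of the levels of both factors. A minimal sketch, in which the dictionary keys are placeholder names and only the level labels come from the slide:

```python
from itertools import product
from math import prod

# Two factors with three levels each; the keys are placeholder names.
factors = {
    "first factor": ["A", "B", "C"],
    "second factor": ["1", "2", "3"],
}

# Number of conditions = product of the number of levels per factor.
n_conditions = prod(len(levels) for levels in factors.values())

# Each condition is one combination of levels, one level from every factor.
conditions = ["".join(combo) for combo in product(*factors.values())]

print(n_conditions)
print(", ".join(conditions))
```

Adding a third factor with two levels would double the number of conditions to 18, which shows how quickly full factorial designs grow.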
HCI Research Methods
▪ Online surveys
▪ Quick, cheap, efficient, broad range of participants
▪ Lab studies
▪ controlled setting without interruptions
▪ In-situ
▪ natural environment
▪ VR/AR
▪ Safe, easy prototyping
▪ Are they different?
[1] A. Voit, S. Mayer, V. Schwind, and N. Henze. 2019. Online, VR, AR, Lab, and In-Situ: Comparison of Research
Methods to Evaluate Smart Artifacts. In CHI ’19. https://doi.org/10.1145/3290605.3300737
HCI Research Methods
▪ Yes, the results are different!
▪ In-situ and VR showed the highest scores, e.g. in hedonic and pragmatic quality
▪ Participants are not able to ignore the experimental apparatus
▪ But which one reflects the “truth” best?
[1] A. Voit, S. Mayer, V. Schwind, and N. Henze. 2019. Online, VR, AR, Lab, and In-Situ: Comparison of Research
Methods to Evaluate Smart Artifacts. In CHI ’19. https://doi.org/10.1145/3290605.3300737
Internal Validity
▪ Identification, documentation, and elimination of confounds
▪ High, when there are no alternative explanations for your results
▪ The variation of your dependent variable is caused by the variation of your independent variable
▪ Low, when experimental effects can be explained by confounds
▪ The variation of your dependent variable can be explained by the variation of confounds
▪ We aim for high internal validity
THIS MESS
▪ Testing – subjects react to the experimental setup or task
▪ History – e.g. events between two measurements
▪ Instrument – e.g. change of the measurement tool
▪ Statistical regression (toward the mean) – data outliers, e.g. caused by inhomogeneous test groups
▪ Maturation – subjects change between two measurements
▪ Experimental mortality – subjects drop out
▪ Selection – lack of randomization of the tested sample
▪ Selective interaction – sequence/order effects
External Validity
▪ The extent to which results can be generalized
▪ High, when results of the study can be transferred to the real world
▪ e.g. does the sample represent the general population?
▪ Low when the results cannot be applied to the population or real-life situations outside of the research setting
→ ecological validity
Internal vs. External Validity
▪ Do internal and external validity contradict each other?
▪ Internal validity: You have to control all interfering variables
▪ External validity: You establish an artificial, experimental setting
▪ Theories are being tested deductively, not inductively
▪ A theory is based on the assumption of falsification
▪ Does the observation in an experiment with high internal validity contradict the theory?
▪ If yes: it is irrelevant whether the results are “representative”
▪ If no: the experiment supports the theory (→ the theory must be further tested)
HCI Research Methods
▪ All methods reflect “the truth”
▪ The influence of external factors determines the ecological validity
▪ Controlled experiments are required to further determine those effects
Literature
▪ Field, Andy & Hole, Graham. (2003). How to Design and Report Experiments.
▪ Cochran, William & Cox, Gertrude. (1950). Experimental Designs.
▪ Campbell, Donald & Stanley, Julian. (1963). Experimental and Quasi-Experimental Designs for Research.