An Introduction to Web-based Experimentation
Stian Reimers
Overview
• Why bother with web testing?
• Different ways of implementing web tests
• Design issues: How to get top-quality data
• Ethics and good scientific practice
• The future of experimenterless experiments
Why bother with web testing?
• Cheap - don’t need to pay participants
• Time saving - once set up, experiment can be left to run
• Thousands of participants
• Wider range of people than traditional undergraduate subject pool
• Possible to target low-frequency subgroups
• Reduces experimenter bias
• Encourages dialogue between academic and public
Case Study: Task Switching at the BBC
• Set up in 2003, to tie in with TV series
• Coded in Flash, mainly using ActionScript
• Visitors to BBC S&N website click through
• Data passed to our server at end of experiment
Age effects on RT
Age effects on specific and general switch costs
Summary: BBC Task Switching Experiment
• Around 50,000 participants so far
• Possible to examine task switching in ages 10-66
• Reported in Reimers and Maylor (2005; 2006)
• Experiment has changed several times
• Allows us to test new theories with minimal effort
• Gets participants for further experiments
Four examples of web-based implementation
HTML with forms (and javascript)
Java
Adobe Flash
Adobe Authorware
HTML
[Diagram: HTML form → Perl script → save data, generate dynamic response page, email experimenter]
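A minimal sketch of the server-side step in this flow, written in Python rather than the Perl named on the slide. The field names, the `handle_submission` function, and the in-memory log are all hypothetical; emailing the experimenter is left out.

```python
import csv
import html
import io
from urllib.parse import parse_qs

FIELDS = ["participant_id", "age", "answer"]  # hypothetical form fields


def handle_submission(body: str, out: io.TextIOBase) -> str:
    """Save one URL-encoded form submission and return a response page."""
    parsed = parse_qs(body, keep_blank_values=True)
    # parse_qs maps each field to a list of values; keep the first one.
    row = {name: parsed.get(name, [""])[0] for name in FIELDS}
    csv.DictWriter(out, fieldnames=FIELDS).writerow(row)
    # Escape participant input before echoing it back in HTML.
    safe_answer = html.escape(row["answer"])
    return f"<html><body><p>Thanks! You answered: {safe_answer}</p></body></html>"


log = io.StringIO()  # stands in for a data file on the server
page = handle_submission(
    "participant_id=p01&age=34&answer=%3Cb%3Eyes%3C%2Fb%3E", log
)
print(log.getvalue().strip())
print(page)
```

The raw answer is stored as submitted, but it is escaped before being shown on the dynamic response page.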
HTML/Javascript: Advantages
• Quick and easy to set up
• Doesn’t require long initial downloads
• No plug-ins required, fewer browser compatibility issues
• Can display embedded visual and audio stimuli
• Familiar interface
Ideal for surveys, personality research, decision making.
HTML/Javascript: Disadvantages
• Poor at <1 sec RT measurement
• Imperfect control over item display
• Hard to control stimulus durations
• Load time between pages depends on internet traffic
• Have to be online for each new page
Not good for most psychophysical experiments, experiments where effects appear in RT rather than accuracy data, etc.
Java: Advantages
• Can be used for fairly accurate RT measurement
• Sandboxed, so can’t damage a user’s computer
• Ostensibly platform independent, so one codebase could be used for web-, lab-, and, say, mobile-based execution
• Costs nothing
• Client-side implementation
Ideal for experiments measuring RT.
Java: Disadvantages
• Relatively difficult language to master, particularly for running on multiple platforms
• Need skill to make programs intuitive and ergonomic
• Slow start-up time
• Issues of different versions (Sun vs. Microsoft)
• Not always installed/enabled
Not good for novice programmers, one-off experiments
• You can return to snipurl.com/128xy to retry the task switching experiment
Adobe Flash: Advantages
• Similar advantages to Java
– Client side processing, sandboxed, platform independent
• Designed for web implementation, so easier than Java to make good-looking experiments
• Can combine code written in Actionscript with animation-style features
• More ubiquitous plug-in
Ideal for ‘fun’ or multi-stage experiments.
Adobe Flash: RT Measurement
Reimers, S., & Stewart, N. (in press). Adobe Flash as a medium for online experimentation: A test of RT measurement capabilities. Behavior Research Methods.
Adobe Flash: Disadvantages
• Requires plug-in (but 97.3% of computers have it installed already)
• Commercial software, so costs money
• Easily decompilable
• Awkward stimulus timing
• May be blocked by advert-filtering software
• Possible differences in performance across platform
Not good for very low spec machines, tachistoscopic presentation, sequences of rigidly timed stimuli.
Adobe Authorware: Advantages
• Similar advantages to Flash
• And very user friendly
• Quite similar to testing applications like Superlab
Ideal for experimenters who aren’t very confident programmers but want to run web experiments.
Adobe Authorware: Disadvantages
• Not cheap to buy
• Requires plug-in which most people won’t have
• Quite a niche product - harder to find casual programmers to code up experiments
• Relatively untested with respect to measurement accuracy, display consistency etc.
Not good for uncommitted participants or cash-strapped researchers.
Designing Web-based Experiments
Key differences
Multiple submission
Drop-out
Dishonesty
Mental state
Recruitment
Key Differences Between Web- and Lab-based Testing
• Less social pressure
– May reduce demand issues
– But also increases drop-out rate, lying
• Unverifiable demographics
• Less control over experimental setting
– Loud music, monitor size, drunkenness
• Less control over multiple submissions
Multiple Submissions
• Historically not that big a problem (Krantz & Dalal, 2000; Musch & Reips, 2000)
– But likely to be more so if participants are paid
• Ask people if they’ve taken part before
• Get unique identifier (email address, NI number)
• Set a cookie
• Log their IP address
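The checks listed above could be combined roughly as follows. This is only a sketch: the `Submission` record and `flag_repeats` function are hypothetical names, and the sample data are made up. Matching on IP address also catches different people sharing a computer or network, so flagged rows are better reviewed than deleted automatically.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    cookie_id: str   # cookie value set on first visit ("" if cookies are off)
    ip_address: str  # address logged by the server
    email: str       # optional unique identifier ("" if not given)


def flag_repeats(submissions):
    """Return one boolean per submission: True if it looks like a repeat."""
    seen_cookies, seen_ips, seen_emails = set(), set(), set()
    flags = []
    for sub in submissions:
        email = sub.email.strip().lower()
        repeat = (
            (sub.cookie_id != "" and sub.cookie_id in seen_cookies)
            or (email != "" and email in seen_emails)
            or sub.ip_address in seen_ips
        )
        flags.append(repeat)
        # Remember this submission's identifiers for later comparisons.
        if sub.cookie_id:
            seen_cookies.add(sub.cookie_id)
        if email:
            seen_emails.add(email)
        seen_ips.add(sub.ip_address)
    return flags


data = [
    Submission("c1", "10.0.0.1", "a@example.org"),
    Submission("c2", "10.0.0.2", ""),
    Submission("c1", "10.0.0.3", ""),                # same cookie as row 1
    Submission("", "10.0.0.2", ""),                  # same IP as row 2
    Submission("c5", "10.0.0.5", "A@Example.org "),  # same email as row 1
]
flags = flag_repeats(data)
print(flags)
```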
Dropout in most studies is a minor problem
• Sample is not representative
– But still better than undergraduates
• Ideally, should log the number of participants who start the experiment relative to number who finish it.
• Gives useful info on how much people are enjoying your study
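One way to log starts against finishes is to record the last stage each participant reached and report the share still present at each stage. The stage names and the example log below are hypothetical.

```python
STAGES = ["consent", "practice", "main_task", "debrief"]  # hypothetical stage names


def dropout_curve(last_stage_reached):
    """Share of starters who reached each stage.

    last_stage_reached holds one entry per participant: the final
    stage that participant got to before finishing or quitting.
    """
    if not last_stage_reached:
        raise ValueError("no participants logged")
    indices = [STAGES.index(stage) for stage in last_stage_reached]
    started = len(indices)
    return [
        (stage, sum(i >= k for i in indices) / started)
        for k, stage in enumerate(STAGES)
    ]


# Made-up log: 10 people start, 5 reach the debrief.
log = ["consent"] * 2 + ["practice"] + ["main_task"] * 2 + ["debrief"] * 5
curve = dropout_curve(log)
completion_rate = curve[-1][1]
for stage, share in curve:
    print(f"{stage:<10} {share:.0%}")
```

The last entry of the curve is the finish-to-start ratio; the earlier entries show where people give up.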
Dropout in experiments can lead to sampling biases and misleading results
[Figure: Easy Condition and Hard Condition panels, each showing Lazy People and Committed People]
So, to prevent this sort of problem, use the ‘high-hurdle’ approach
[Figure: Dull, irrelevant task → Easy Condition / Hard Condition]
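A worked illustration with made-up numbers. Assume half the sample is "lazy" (mean score 50) and half "committed" (mean score 70), and that condition has no real effect on scores. If lazy people quit the hard condition more often, the completers differ between conditions anyway. With a high hurdle, assumed here to make lazy people quit before assignment, the two conditions end up with the same mix. All shares, scores, and dropout rates are hypothetical.

```python
def completer_mean(groups, dropout):
    """Mean score among participants who finish.

    groups:  {name: (share_of_sample, mean_score)}
    dropout: {name: probability of quitting before the end}
    """
    finishing = {g: share * (1 - dropout[g]) for g, (share, _) in groups.items()}
    total = sum(finishing.values())
    return sum(finishing[g] * groups[g][1] for g in groups) / total


GROUPS = {"lazy": (0.5, 50.0), "committed": (0.5, 70.0)}

# No hurdle: lazy people quit the hard condition far more often.
easy = completer_mean(GROUPS, {"lazy": 0.2, "committed": 0.0})
hard = completer_mean(GROUPS, {"lazy": 0.8, "committed": 0.0})

# High hurdle: 80% of lazy people quit during the dull task, before
# being assigned, and nobody quits afterwards in either condition.
after_hurdle = {"lazy": (0.5 * 0.2, 50.0), "committed": (0.5, 70.0)}
nobody_quits = {"lazy": 0.0, "committed": 0.0}
easy_hurdle = completer_mean(after_hurdle, nobody_quits)
hard_hurdle = completer_mean(after_hurdle, nobody_quits)

print(f"no hurdle:   easy {easy:.1f}, hard {hard:.1f}")
print(f"high hurdle: easy {easy_hurdle:.1f}, hard {hard_hurdle:.1f}")
```

Without the hurdle the hard condition looks several points better purely because of who stayed; with it, the spurious difference disappears.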
And generally, try to prevent drop-out by making things fun, easy, and interesting
• Make it fun to do and nice to look at
• Implement as a game where possible
• Sunk cost effect: Put the dull stuff at the end
• Ask people to complete the entire test
• Feedback
– Tell people about themselves
– Comparisons with rest of population
• Describe the experiment’s aims and the science behind it
Dishonesty, carelessness, misunderstanding
• Not as big a problem as you might imagine
– 3.5-6.3% junk / 1% split-half inconsistencies (Johnson, 2005)
– 1-5% inconsistency in sex differences study (Reimers, in press)
– Cf. 0.7% of pencil and paper (Gough & Bradley, 1996)
• Make submission of demographic data voluntary
– Or give option of ‘I’d rather not say’
• Ask the same questions at start and end
– Check for consistency, but may look sneaky
• Put in equivalent of a ‘lie scale’
• Obviously, remove people who aren’t responding honestly
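The start-and-end consistency check might look something like this. The field names and responses are hypothetical; answers are compared after trimming whitespace and ignoring case.

```python
# Pairs of fields holding the same question asked at the start and at the end.
REPEATED = [("age_start", "age_end"), ("sex_start", "sex_end")]  # hypothetical


def inconsistent_fields(response):
    """Return the repeated questions whose two answers disagree."""
    return [
        first
        for first, second in REPEATED
        if str(response[first]).strip().lower()
        != str(response[second]).strip().lower()
    ]


responses = [
    {"id": "p01", "age_start": "34", "age_end": "34", "sex_start": "F", "sex_end": "f"},
    {"id": "p02", "age_start": "19", "age_end": "91", "sex_start": "M", "sex_end": "M"},
    {"id": "p03", "age_start": "52", "age_end": "52", "sex_start": "M", "sex_end": "M"},
    {"id": "p04", "age_start": "27", "age_end": "27", "sex_start": "F", "sex_end": "F"},
]

flagged = {}
for r in responses:
    problems = inconsistent_fields(r)
    if problems:
        flagged[r["id"]] = problems

inconsistency_rate = len(flagged) / len(responses)
print(flagged, f"{inconsistency_rate:.0%}")
```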
Mental state of participants
• Can’t screen for people in abnormal mental states
• Relatively small proportion of experimental population
• Remove egregious datasets at analysis stage
• Ask people directly (and sensitively)
• Include screening questions to show general competence
Getting participants to do your (ergonomic, well-designed) experiment
• Get links (e.g. from department or study index site)
• Advertise (banners etc.)
– Costs money. Unproven effectiveness, but great potential.
• Set up email list of willing participants
• Pay participants
– Costs money. Multiple submissions, careless participation. Hassle to implement.
• Use a reward scheme like ipoints
– Effective, can pay little, no multiple submissions, select appropriate demographic, easy to run
Ethics of Web-based Experiments
Key differences
Informed consent
Sensitive material / personal questions
Unflattering feedback
Deception
Debrief
Key Differences between lab- and web-based research
• You are not present
– Can’t offer feedback and reassurance
– Can’t check a participant is in a suitable mood
– Can’t tell how old a participant is
– Can’t answer any questions or concerns
• Broader demographic
– More lonely or socially isolated participants
– More participants with mental illnesses
Informed consent
Informed consent: Pros and cons
• Follows ethical guidelines
• Explains things that may otherwise have caused concern to the participant
– Dropping out is okay, data are anonymous
• Makes the experiment look more authoritative and serious
• But may scare off people who’d otherwise have enjoyed the experiment
• Seems to be more of a back-covering exercise than an attempt to ensure the participant is protected
Do I need informed consent?
Kraut R., Olson J., Banaji M., Bruckman A., Cohen J., & Couper M. (2004) Psychological research online: Report of board of scientific affairs' advisory group on the conduct of research on the Internet, American Psychologist 59, 105-117.
Sensitive material / Personal questions
• You may offend people or evoke unpleasant thoughts or memories.
• Warn people at the start of the experiment
• Remind people that responding is optional
• Say ‘Adults only’ or better still get people to enter their age, and skip sensitive questions if under 18
• Be sensitive in wording of questions and implications of particular ways of framing information
• Offer contact details for further information
Feedback risks making a participant feel stupid or establishing apparent norms
• Don’t tell people they’re in the bottom decile for performance on a cognitive/IQ task
‘all the women were strong, all the men were good looking and all the children were above average’
• Use broad categories for giving feedback, but better not to lie about actual performance. ‘You did better than 20%...’
• Include caveats about how poor a measure of performance your test is
• And how performance varies a lot intraindividually
• And how the other participants may not be representative
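A sketch of broad-category feedback along these lines. The share of earlier participants scoring lower is rounded down to a 20-point band, so the statement stays true without being precise. Giving the lowest band a neutral message with no number is an assumption of this sketch, as are the function names and reference scores.

```python
from bisect import bisect_left

CAVEAT = (
    "Scores on this task vary a lot from day to day, and the other "
    "participants may not be representative."
)


def percent_scoring_lower(score, reference_scores):
    """Percentage of reference scores strictly below this score."""
    ordered = sorted(reference_scores)
    return 100.0 * bisect_left(ordered, score) / len(ordered)


def broad_feedback(score, reference_scores, step=20):
    """Feedback message using bands of `step` percentage points."""
    pct = percent_scoring_lower(score, reference_scores)
    band = int(pct // step) * step  # round down, so the claim is never inflated
    if band == 0:
        # Assumed policy: no number for the lowest band.
        return "Thanks for taking part! " + CAVEAT
    return f"You did better than {band}% of previous participants. " + CAVEAT


reference = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]  # made-up earlier scores
print(broad_feedback(55, reference))
print(broad_feedback(15, reference))
```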
Deception is not recommended online
• Always a sensitive issue
• Difficult online, because debrief is harder
• Need to reassure participants that they are not being mocked or exploited when experimenter is not present
• Get ethics board input before running
Debrief
• Try to explain the aim of the experiment in simple terms
– Run it past your friends and family first to make sure it’s easily understandable
• Thank the participant for their time
• Give them an email address to contact you if they want further information or to see the final results
Sixteen standards for web-based experimenting (Reips, 2002)
Sixteen standards
• Consider a software tool for development
• Pretest for clarity
• Decide on HTML vs. plugins
• Check for errors
• Link to several sites to check self selection
• Run online and offline for comparison
• Use warm-up technique to avoid dropout (maybe)
• Use dropout to check motivational confounding
Sixteen standards
• Minimise dropout
• Highlight seriousness of experiment
• Check for obvious naming of files or passwords
• Avoid multiple submissions
• Perform consistency checks
• Keep full details for others to analyse
• Report and analyse dropout curves
• Keep experiment available online
The Future…
Massive longitudinal panel experiments
• Already used to running experiments with >250,000 participants
• Possible to get thousands of people from ever broader demographic to participate repeatedly
• Look at, for example, cognitive aging of individuals
• Set up panels of reliable participants
– Choose demographic, etc. Cross-tabulate results from many experiments, get vast amounts of data
New devices
• Run experiments using WAP on mobile phones
• If you know Java, it’s relatively easy to adapt an application to, say, Series 60 Nokias
• E.g., memory task. Participants download application. Every hour the phone vibrates and participants see another item. Test at end of day. Send results by SMS.
• Give people a task to do at unpredictable points, check effect of time of day, mood, etc.
Conclusion
• Web-based testing can be a powerful tool for investigating issues hard to investigate in the lab
• Web-based testing has some core differences from lab-based testing
• These differences have advantages and disadvantages
• In years to come there will be new ways to test people outside the laboratory
• Web-based testing is now accepted in the research community