Bringing the crowdsourcing revolution to research in communication disorders
Tara McAllister Byun, PhD, CCC-SLP
Suzanne M. Adlof, PhD
Michelle W. Moore, PhD, CCC-SLP
2014 ASHA Convention
Orlando, Florida
Disclosure
• The individuals presenting this information are involved in recruiting individuals to complete tasks through AMT or other online platforms.
• This session may focus on one specific approach, with limited coverage of other alternative approaches.
• Portions of the research were supported by funding from IES.
• No other conflicts to disclose.
Crowdsourcing for CSD research
Case study 3: Obtaining ratings of sentence contexts for vocabulary instruction
Suzanne Adlof
Goals
• To develop a web-based platform that provides individualized, effective vocabulary instruction to high school students
  • Individualized based on content of study and pace of study
  • Instruction includes dictionary definitions and real-world contexts
• To be able to teach any word in the English language
  • Beginning with a seed corpus of 1000 target words and >70,000 contexts
  • Want ≥ 20 good contexts per word
This research is supported by a grant from the Institute of Education Sciences: R305A130467 (Adlof, PI).
Which contexts are most “nutritious” for vocabulary instruction?
• Initial corpus of texts randomly retrieved in mass quantities from the Internet; the quality of retrieved contexts is highly variable.
• Example contexts for target word “guile”:
• There are some people, like Nathanael, who truly have no guile. They are very transparent and open. They accept people at face value and, since they have no guile themselves, are bewildered when they are faced with wickedness and deceit in others. But, truly guileless people are rare. They are both refreshing and frustrating at the same time.
• Show me the dirtpile and I will pray that the soul can take three stowaways" confuses me. What are the three stowaways? One of them could be him - like he wants to go with her, but what are the other two? Also, why does she vanish with no guile? Why would she vanish with guile?
• guile is the program, the -c switch instructs guile to evaluate the statement after the switch (similar to the -e switch for perl). The use-modules directive will ask guile to load the slib module in the ice-9 directory. After the use-modules statement is evaluated, it will proceed to call functions available through Slib, namely require and printf.
Challenges of obtaining ratings
• Scale of task: 70,000 contexts is a lot!
• Need multiple ratings of each context to ensure reliability
• Traditional lab setup:
  • 50 undergrad students rate 100 contexts per day for $8.00 each
  • 140 consecutive days & $56,000 to get 10 ratings of each context!
• AMT setup:
  • AMT workers each rate 5 contexts at a time, for 10-12 cents
  • Speed of acquisition depends on many factors, but primary factor is building up a qualified worker pool
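The lab figures above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the AMT totals are an estimate derived here from the per-task pay on the slide (worker payments only, platform fees excluded), not numbers reported in the talk.

```python
import math

CONTEXTS = 70_000          # contexts in the seed corpus (from the slide)
RATINGS_PER_CONTEXT = 10   # ratings wanted per context (from the slide)
TOTAL_RATINGS = CONTEXTS * RATINGS_PER_CONTEXT


def lab_estimate(raters=50, contexts_per_day=100, pay_per_day=8.00):
    """Days and dollars needed for the traditional lab setup."""
    days = math.ceil(TOTAL_RATINGS / (raters * contexts_per_day))
    cost = days * raters * pay_per_day
    return days, cost


def amt_estimate(contexts_per_task=5, pay_cents=10):
    """Paid task submissions and worker payments for the AMT setup.

    Assumption: one paid submission covers `contexts_per_task` ratings
    by one worker. Platform fees are not included.
    """
    submissions = math.ceil(TOTAL_RATINGS / contexts_per_task)
    dollars = submissions * pay_cents / 100
    return submissions, dollars


lab_days, lab_cost = lab_estimate()
submissions, amt_low = amt_estimate(pay_cents=10)
_, amt_high = amt_estimate(pay_cents=12)

print(f"Lab: {lab_days} days, ${lab_cost:,.0f}")
print(f"AMT: {submissions:,} submissions, ${amt_low:,.0f} to ${amt_high:,.0f}")
```

Pay is kept in integer cents so the totals do not pick up floating-point rounding noise.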
Step 1: Qualification Test
2. Building a Pool of Qualified Workers
• Began posting QTs and HITs in August 2013
• Also advertised on listservs to recruit a larger pool of workers interested in language, word learning
• 2317 AMT workers have taken the QT
• 947 (41%) workers qualified
2. Worker Retention
• We have posted > 11,000 HITs for 947 qualified workers (soliciting 10 ratings each for >55,000 contexts)
• 75% of all context ratings have come from 27 “very high productivity raters” who have rated > 1000 contexts each
[Figure: Number of HITs completed after QT. Categories: 0, 1-2, 3-10, 11-50, 51-150, 151-500, 501-1000, > 1000. Percentages as extracted: 22%, 26%, 26%, 11%, 6%, 5%, 1%, 3%.]
3. Reliability and Validity Checks
• 93 contexts each rated by expert and 10 AMT raters
• 176 AMT raters represented across contexts
• AMT average rating correlates with expert rating at r=.71, p<.001
[Figure: scatterplot of Expert Rating (y-axis) against Average AMT Rating (x-axis).]
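The validity check described here, averaging the AMT ratings for each context and correlating those averages with the expert's rating, can be sketched as follows. The ratings below are made up for illustration and are not data from the study; `pearson_r` and the variable names are hypothetical, and the study's own analysis tooling is not stated in the slides.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return sxy / sqrt(sxx * syy)


# Made-up ratings for illustration only; not data from the study.
amt_ratings = {
    "ctx-a": [1, 2, 1, 2, 1],
    "ctx-b": [3, 3, 4, 2, 3],
    "ctx-c": [4, 4, 3, 4, 4],
    "ctx-d": [2, 3, 2, 2, 3],
}
expert_ratings = {"ctx-a": 1, "ctx-b": 3, "ctx-c": 4, "ctx-d": 3}

# Align both series on the same context order before correlating.
context_ids = sorted(amt_ratings)
amt_means = [mean(amt_ratings[c]) for c in context_ids]
expert = [expert_ratings[c] for c in context_ids]

r = pearson_r(amt_means, expert)
print(f"r = {r:.2f} across {len(context_ids)} contexts")
```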
3. Reliability and Validity Checks
• Spot checking suggests ratings are generally valid

Average AMT Rating (SD) | Context for target word “collusion”
1.5 (.53) | In his discussion of this issue in the context of the fallout from California's recent attempt at electricity deregulation, Dr. Rapp notes that claims of collusion must be reconciled with the specific market facts and regulatory rules that affect suppliers' bidding behavior and capacity decisions. This is not always easy.
2.0 (.67) | In his discussion of this issue in the context of the fallout from California's recent attempt at electricity deregulation, Dr. Rapp notes that claims of collusion must be reconciled with the specific market facts and regulatory rules that affect suppliers' bidding behavior and capacity decisions. This is not always easy.
3.0 (.94) | We provide a collusive framework with heterogeneity among firms, investment, entry, and exit. It is a symmetric-information model in which it is hard to sustain collusion when there is an active firm that is likely to exit in the near future. Numerical analysis is used to compare a collusive to a noncollusive environment.
3.7 (.48) | Some poker players think that by sharing information with their friends on Party Poker, they can gain an advantage and cheat their opponents. This is known as poker collusion, two or more players will use a chatroom, instant messages or even the telephone to tell their friends what cards they have.
5. What we learned along the way
• Importance of clear instructions
• Importance of “customer service,” e.g., fast payment & good communication
  • TurkOpticon reviews
• Include quality control measures
  • Inter-rater agreement
  • Premier rater training
  • Attention checks
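Two of the quality control measures listed above, attention checks and inter-rater agreement, could be applied to the collected ratings along these lines. This is a minimal sketch under stated assumptions, not the procedure used in the study: the record layout, the attention-check answer key, the use of the per-context standard deviation as a rough agreement indicator, and the `max_sd` threshold are all hypothetical.

```python
from statistics import mean, stdev

# Hypothetical records: (worker_id, item_id, rating). Made-up values.
records = [
    ("A", "ctx-1", 4), ("B", "ctx-1", 4), ("C", "ctx-1", 3), ("D", "ctx-1", 1),
    ("A", "ctx-2", 1), ("B", "ctx-2", 4), ("C", "ctx-2", 2), ("D", "ctx-2", 4),
    ("A", "check-1", 1), ("B", "check-1", 1), ("C", "check-1", 1), ("D", "check-1", 4),
]

# Hypothetical attention-check items with a known correct rating.
ATTENTION_KEY = {"check-1": 1}


def failed_attention(recs, key):
    """Workers who gave the wrong rating on any attention-check item."""
    return {w for w, item, rating in recs if item in key and rating != key[item]}


def summarize(recs, key, max_sd=1.0):
    """Per-context (mean, SD, low_agreement) after dropping failed workers."""
    excluded = failed_attention(recs, key)
    by_context = {}
    for worker, item, rating in recs:
        if worker in excluded or item in key:
            continue  # skip excluded workers and the check items themselves
        by_context.setdefault(item, []).append(rating)
    summary = {}
    for item, ratings in by_context.items():
        sd = stdev(ratings) if len(ratings) > 1 else 0.0
        summary[item] = (round(mean(ratings), 2), round(sd, 2), sd > max_sd)
    return excluded, summary


excluded, summary = summarize(records, ATTENTION_KEY)
print("Excluded workers:", sorted(excluded))
for item, (avg, sd, flagged) in sorted(summary.items()):
    print(item, avg, sd, "low agreement" if flagged else "ok")
```

Contexts flagged for low agreement could be routed for additional ratings or expert review; whether to exclude a worker entirely after one missed check is a policy choice the sketch does not settle.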
6. Next Steps
• Students generate their own contexts as part of instructional program
• Students rate contexts as part of instructional program
• Machine learning for automated ratings
• “Authentic artificial intelligence”
Questions? Interested in trying AMT?