Towards Crowd-sourced Semantic-based Multimodal User Interfaces Xiaojuan Ma Huawei Noah’s Ark Lab


Tip-of-the-Tongue Phenomenon (ToT)

I am allergic to … Oh no! What is the name of that medicine?


When Syntax-level Paraphrasing Fails …


Semantics-level Paraphrasing does the Work

Allergy, injection, infection, fungi, extract

[Diagram: Penicillin linked to extract, mold, injection, pill, infection, bacteria, fungi, allergy]

Scalable Semantic-based Multimodal User Interface

[Diagram labels: Multimodal Information, Concept, User, Semantic Lexicon, interface]

Semantic Lexicon
• entities: orange, fruit, food, apple, vegetable, abstraction
• attributes: cold, hot, chill, cool, freezing, warm, torrid
• actions: eat, understand, drink, consume, suck, sip

Example words: drink, recipe, fruit, good, cold
Example sentence: “A recipe for fruit juice that is good to drink cold”

• Sense Making Effectiveness
• Word Finding Efficiency
• Mimic Mental Lexicon in Human Memory

Understand Mental Lexicon

Mental Lexicon: “words and other verbal symbols, their meaning and referents, about relations among them, and about rules, formulas, and algorithms for manipulating them” (Endel Tulving, 1972)

• Psychology, Linguistics: Availability, Accessibility
• AI, NLP: Disambiguation, Retrieval
• HCI, HRI: Sense-making, Word-finding

Models of Mental Lexicon I

Network Model (Collins and Loftus, 1975)

• Relation-based: Is-a, Has-a, etc.
• Category-based: Is-a relation

[Diagram: Animal; Fish, Bird, Mammal; Salmon, Shark, Bat, Cow, Penguin; parts Head, Fin, Tooth, Wing, Egg, Horn, Coat, Face; links labeled Is-a and Has-a]
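A minimal Python sketch of the network model may help make the Is-a / Has-a structure concrete. The concept names come from the slide, but the specific Has-a assignments (for example, which animal owns Horn or Coat) are illustrative assumptions, since the diagram's edges are not recoverable here. The helper names `ancestors` and `parts` are hypothetical.

```python
# Is-a links: each concept points to its parent category.
IS_A = {
    "salmon": "fish", "shark": "fish",
    "bat": "mammal", "cow": "mammal",
    "penguin": "bird",
    "fish": "animal", "bird": "animal", "mammal": "animal",
}

# Has-a links: ASSUMED assignments, for illustration only.
HAS_A = {
    "animal": {"head"},
    "fish": {"fin"},
    "shark": {"tooth"},
    "bird": {"wing", "egg"},
    "mammal": {"coat"},
    "cow": {"horn"},
}


def ancestors(concept):
    """Follow Is-a links upward and return the chain of categories."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain


def parts(concept):
    """Collect Has-a parts of a concept, including those inherited via Is-a."""
    found = set(HAS_A.get(concept, ()))
    for parent in ancestors(concept):
        found |= HAS_A.get(parent, set())
    return found


print(ancestors("salmon"))
print(sorted(parts("penguin")))
```

In this sketch a concept inherits the parts of every category above it, which is one common reading of a category-based (Is-a) hierarchy.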


Models of Mental Lexicon II

Feature Model (Smith, Shoben, and Rips, 1974)

[Diagram: Robin, Bird, and Ostrich linked to features. Features: Is small, Walks/Runs, Is large, Has wings, Has feathers, Hops, Long legs and neck, Can Fly, Orange Breast. Legend: Weak Connection, Strong Connection; Defining Features, Characteristic Features]

Models of Mental Lexicon III

Associative Model (Raaijmakers and Shiffrin, 1981)

[Diagram: associative network over Black, Fish, Bird, Salmon, Robin, Penguin, Fly, Pink, Red, Wing, Feather, Orange, Swim]

Google Knowledge Graph


Microsoft MSRA Probase

(http://research.microsoft.com/en-us/projects/probase/)

[Diagram: concept hierarchy with Markets, European Markets, Emerging Markets, Developing Countries, Newly Industrialized Countries]

China: Area = 9,596,961 sq km; Population = 1.3 billion; GDP = $8.7 trillion
India: Area = 3,287,263 sq km; Population = 1.1 billion; GDP = $3.57 trillion
sim. (China, India) = 0.84

Princeton WordNet & Stanford ImageNet

(http://wordnet.princeton.edu/) (http://www.image-net.org/)

MIT ConceptNet

(http://conceptnet5.media.mit.edu/)


Semantic-based UI: WordNet + Evocation

[Diagram: Is-a / Has-a network (Animal; Fish, Bird, Mammal; Head, Fin, Tooth, Wing, Egg, Horn, Coat, Face) with Evocation links to Black, Fly, Pink, Meat, Brown, Swim, Dangerous]

WordNet® is a large lexical database of English in which concepts are interlinked by means of conceptual-semantic and lexical relations. (http://wordnet.princeton.edu/)

Evocation is a bi-directional, weighted, across-parts-of-speech semantic association / relatedness measure of how much one concept brings to mind another. (Boyd-Graber et al., 2006; Nikolova et al., 2009; Ma et al., 2013)


Why WordNet?


Why Evocation: Spreading Activation Theory

Prepared by Evocation

Faster word finding

Easier sense making

Activation of node $y$: $a_y = \sum_x f_{xy}\, a_x + c_y$

$f_{xj} = l \cdot \dfrac{s_j}{\sum_k s_k}$, if node $x$ connects to node $j$

Strength: $s = \sum_i t_i^{-b}$

(Collins and Loftus, 1975)
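As a rough illustration of spreading activation, the Python sketch below pushes activation from a source concept along weighted links, splitting each node's activation among its neighbors in proportion to link strength and attenuating it by a loss factor. The function name `spread_activation`, the `loss` and `steps` parameters, and the toy link strengths are all assumptions made for this example. It is not the exact formulation used in the talk.

```python
from collections import defaultdict


def spread_activation(links, source, loss=0.5, steps=2):
    """Spread activation outward from `source` for a fixed number of steps.

    links: dict mapping node -> {neighbor: link strength} (hypothetical data).
    Each step, a node passes loss * activation to its neighbors,
    divided in proportion to the link strengths.
    """
    activation = defaultdict(float)
    activation[source] = 1.0
    frontier = {source: 1.0}
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, act in frontier.items():
            neighbors = links.get(node, {})
            total = sum(neighbors.values())
            if total == 0:
                continue  # node has no outgoing links
            for neighbor, strength in neighbors.items():
                next_frontier[neighbor] += loss * act * strength / total
        for node, act in next_frontier.items():
            activation[node] += act
        frontier = next_frontier
    return dict(activation)


# Toy network with made-up strengths.
links = {
    "doctor": {"nurse": 3, "hospital": 1},
    "nurse": {"hospital": 1, "care": 1},
}
result = spread_activation(links, "doctor")
for node, act in sorted(result.items(), key=lambda kv: -kv[1]):
    print(node, act)
```

Concepts that end up with higher activation are the ones an interface would surface first, which is the intuition behind faster word finding.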


Outline

• Introduction
  – Goal
    • Effective and efficient semantic user interfaces
    • Sense making and word finding at scale
  – Proposed Approach
    • Semantic network augmented with associative links
    • Theoretical foundation: Spreading Activation Theory
• Methodology
• Evaluation
• Conclusion and Future Directions

Outline

• Introduction
• Methodology
  – Crowdsourcing enhanced with ML and NLP
  – Method 1: open response-based crowdsourcing
  – Method 2: rating-based crowdsourcing
• Evaluation
• Conclusion and Future Directions

Our Approach: (Semi-)Crowdsourcing

[Diagram labels: Refinement, Extension, crowd, crowd]

Crowdsourcing
• Cleaner data
• Direct reflection

NLP and ML
• Scalable
• Cost efficient


Crowdsourcing


Method I: Open Response-based Crowdsourcing

Free Association Norms → Evocation

[Diagram: Doctor linked to Nurse, PhD, Hosp, Ill, Flu, Clinic, ER, App, Health, Medical, Pill, Pain, Care, Serious]

Stimulus Word

First Response Word in Mind

6000+

(5000+) (75,000+)

• Doctor (n.): A licensed medical practitioner
• Nurse (n.): One skilled in caring for the sick
• Hospital (n.): A health facility where patients receive treatment
• Sick (adj.): Affected by an impairment of normal physical or mental function
• Diagnose (v.): Determine or distinguish the nature of an illness

Evocation strengths shown: 0.8, 0.6, 0.5, 0.2

(Ma, Language Resources and Evaluation, 2013)

http://w3.usf.edu/FreeAssociation/Intro.html



Free Association Norms → Evocation

• Step 1: Cluster response words
• Step 2: Disambiguate word senses
• Step 3: Assign evocation strength

[Diagram: responses to Doctor (Nurse, Sick, Hosp, Ill, Flu, Clinic, ER, App, Health, Medical, Pill, Pain, Care, Serious) mapped to the senses Doctor (n.), Nurse (n.), Hospital (n.), Sick (adj.), Diagnose (v.) (75,000+), with strengths 0.8, 0.6, 0.5, 0.2]

Step 1: Cluster Response Words


Step 2: Disambiguate Word Senses

Algorithms of Semantic Association Measures

• WordNet Path-based: “path”, “wup”, “lch”
• WordNet Gloss-based: “lesk”, “vector”, “vector_pairs”
• Corpora-based: “res”, “lin”, “jcn”

Voting-based Process

Step 2: Disambiguate Word Pairs

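Simple Voting

$$\mathrm{vote}_k(s_{x,w_j}) = \begin{cases} 1, & \text{if } \mathrm{score}_k(s_{x,w_j}) = \max_{i=1,\dots,N_j} \mathrm{score}_k(s_{i,w_j}) \\ 0, & \text{else} \end{cases}$$

$$\mathrm{voteCount}(s_{i,w_j}) = \sum_{k=1}^{K} \mathrm{vote}_k(s_{i,w_j})$$

$$s_{w_j} = s_{x,w_j} \quad \text{if } \mathrm{voteCount}(s_{x,w_j}) = \max_{s_{i,w_j} \in \mathrm{candidates}(s_{w_j})} \mathrm{voteCount}(s_{i,w_j})$$

Weighted Voting

$$\mathrm{weight}_k(s_{x,w_j}) = \mathrm{score}_k(s_{x,w_j}) \,/\, \max_{i=1,\dots,N_j} \mathrm{score}_k(s_{i,w_j})$$

$$\mathrm{weightedVote}(s_{i,w_j}) = \sum_{k=1}^{K} \mathrm{weight}_k(s_{i,w_j})$$

$$s_{w_j} = s_{x,w_j} \quad \text{if } \mathrm{weightedVote}(s_{x,w_j}) = \max_{s_{i,w_j} \in \mathrm{candidates}(s_{w_j})} \mathrm{weightedVote}(s_{i,w_j}) \qquad \text{(voting among candidates)}$$

$$s_{w_j} = s_{x,w_j} \quad \text{if } \mathrm{weightedVote}(s_{x,w_j}) = \max_{i=1,\dots,N_j} \mathrm{weightedVote}(s_{i,w_j}) \qquad \text{(voting among all senses)}$$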
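The two voting schemes can be sketched in a few lines of Python. Here each of the K association measures is represented as a dict from candidate sense to score. The sense labels, the scores, and the function names `simple_vote`, `weighted_vote`, and `pick` are hypothetical, chosen only to illustrate the idea. In simple voting, each measure gives one vote to its top-scoring sense (ties share votes). In weighted voting, each measure contributes its score normalized by its own maximum.

```python
def simple_vote(scores):
    """Count, per sense, how many measures rank it highest (ties all count)."""
    counts = {}
    for measure in scores:
        best = max(measure.values())
        for sense, score in measure.items():
            counts[sense] = counts.get(sense, 0) + (1 if score == best else 0)
    return counts


def weighted_vote(scores):
    """Sum, per sense, each measure's score divided by that measure's maximum."""
    totals = {}
    for measure in scores:
        best = max(measure.values())
        for sense, score in measure.items():
            weight = score / best if best > 0 else 0.0
            totals[sense] = totals.get(sense, 0.0) + weight
    return totals


def pick(votes, candidates=None):
    """Return the winning sense, optionally restricted to a candidate set."""
    pool = votes if candidates is None else {
        s: v for s, v in votes.items() if s in candidates
    }
    return max(pool, key=pool.get)


# Made-up scores from K = 3 measures for three senses of one word.
scores = [
    {"doctor#1": 0.9, "doctor#2": 0.3, "doctor#3": 0.1},
    {"doctor#1": 0.4, "doctor#2": 0.8, "doctor#3": 0.2},
    {"doctor#1": 0.5, "doctor#2": 0.25, "doctor#3": 0.5},
]

print(simple_vote(scores))
print(pick(weighted_vote(scores)))                                       # among all senses
print(pick(weighted_vote(scores), candidates={"doctor#2", "doctor#3"}))  # among candidates
```

Normalizing by each measure's maximum keeps measures with different score ranges comparable, which is presumably why the weighted variant divides by the per-measure maximum.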


Step 3: Assign Evocation Strength

Forward strength = % of agreement (Avg. = 5.73%, SD = 9.37%)

Strength levels: very strong (immediate), strong, moderate, weak
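Reading forward strength as the share of participants who gave a particular response to a stimulus word, it can be computed as in the short Python sketch below. The response counts are invented for illustration, and `forward_strength` is a hypothetical helper name.

```python
from collections import Counter


def forward_strength(responses):
    """Fraction of participants who produced each response word."""
    counts = Counter(responses)
    total = len(responses)
    return {word: n / total for word, n in counts.items()}


# Made-up first responses of 20 participants to the stimulus "doctor".
responses = ["nurse"] * 8 + ["hospital"] * 6 + ["sick"] * 4 + ["pill"] * 2
strengths = forward_strength(responses)

for word, strength in strengths.items():
    print(f"doctor -> {word}: {strength:.0%}")
```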



From Free Association Norms to Evocation


Method II: Rating-based Crowdsourcing

(Nikolova et al., ASSETS2009)

2990 Amazon Mechanical Turkers

10,000 pairs of concepts

$0.07 per 50 pairs

10 days


Improved Crowdsourcing Evocation Rating

41,604 pairs of concepts with method (a)

60,000 pairs of concepts with method (b)


Comparing Two Crowd-sourced Evocation Datasets


Extension of Evocation via Boosting Algorithm

(BoosTexter, Schapire and Singer, 2000)

WordNet-based Features
• “path” – shortest path
• “jcn” – Jiang & Conrath
• “lesk” – Banerjee & Pedersen
• “hso” – Hirst & St. Onge
• “lch” – Leacock & Chodorow
• “pos” – Part of Speech

Corpora/Context-based Features
• Relative Entropy
• Mean
• Variance
• L1 Distance
• L2 Distance
• Correlation
• Contextual Overlap
• LSA-vectors Cosine
• Frequency
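To give a feel for how boosting can extend evocation labels to unrated concept pairs, here is a small pure-Python sketch of binary AdaBoost with decision stumps. This is a simplified stand-in and not BoosTexter itself, which handles multi-class, multi-label problems. The two feature columns (a path-based similarity and a contextual overlap score), their values, and the labels are made up for illustration.

```python
import math


def train_stump(X, y, w):
    """Find the single-feature threshold rule with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[f] >= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, sign)
    return best


def adaboost(X, y, rounds=5):
    """Train a weighted ensemble of stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, f, thr, sign = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, thr, sign))
        preds = [sign if row[f] >= thr else -sign for row in X]
        # Increase the weight of misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    """Sign of the weighted vote of all stumps."""
    score = sum(a * (s if row[f] >= t else -s) for a, f, t, s in model)
    return 1 if score >= 0 else -1


# Hypothetical features per concept pair: [path similarity, contextual overlap].
X = [[0.9, 0.8], [0.8, 0.6], [0.7, 0.9], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
y = [1, 1, 1, -1, -1, -1]  # +1 = evocative pair, -1 = not (made-up labels)

model = adaboost(X, y)
print(predict(model, [0.85, 0.7]))
print(predict(model, [0.15, 0.2]))
```

In practice the crowd-rated pairs would serve as training data and the learned model would predict evocation for the much larger set of unrated pairs.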


Outline

• Introduction
• Methodology
• Evaluation
  – Evaluation 1: sense making effectiveness
  – Evaluation 2: word finding efficiency
  – Extension with other crowd-sourced human data
• Conclusion and Future Directions

Evaluation I: Sense Making Effectiveness

(Ma et al., MM2009)


Evaluation I: Crowd-sourced Image/Sound Datasets

Peekaboom Game Dataset (Von Ahn et al., 2006)
• 3,086 images
• About 18,500 labels

SoundNet Dataset (Ma et al., 2010)
• 327 environmental sound clips
• About 8,000 labels
• 3000 Amazon Mechanical Turkers, 100 people / sound

Evaluation I: Word Sense Disambiguation

Dictionary Definition:

1. Cow -- (a fully grown female animal of a domesticated breed of ox, kept to produce milk or beef)

2. Cow -- (a large unpleasant woman)

Labels: Sky, Grass, Cow, Green; Cow, Moo

[Diagram: Image Similarity, Audio Similarity, and Label Similarity + WSD feeding Machine Learning]

Examples: Cow mooing; Cows on the grass; Milking a cow; Cattle returning at dusk

Evocation Differs from Existing Semantic Relatedness Measures


Evaluation II: Word Finding Efficiency

(Nikolova, Ma, Tremaine & Cook, IUI2010)


Evaluation II: Task and Interfaces

Crowd-sourced evocation

Is-a in WordNet


Evaluation II: Participants – People with Aphasia

20 stroke survivors with language impairment


Evaluation II: Word Finding Efficiency

Task Completion Time (min.)


Outline

• Attention (online eye-tracking)
• Emotion (food messaging)
• Intention (trip planning)

(Cheng, Sun, Ma, Forlizzi, Hudson & Dey, CSCW2015)

Attention: Social Eye Tracking


Emotion: Food Messaging

(Wei, Ma & Zhao, CHI2014)


Intention: Trip Planning

(Chen, Zhang, Guo, Ma, et al., IEEE Transactions on Intelligent Transportation Systems)



Outline

• Introduction• Methodology• Evaluation• Conclusion and Future Directions


Crowd-sourced Semantic-based Multimodal User Interface

[Diagram: Is-a / Has-a network (Animal; Fish, Bird, Mammal; Head, Fin, Tooth, Wing, Egg, Horn, Coat, Face) with Evocation links to Black, Fly, Pink, Meat, Brown, Swim, Dangerous]

[Diagram labels: Multimodal Information, Concept, User, Semantic Lexicon, interface]

Example words: drink, recipe, fruit, good, cold
Example sentence: “A recipe for fruit juice that is good to drink cold”

• Sense Making Effective
• Word Finding Efficient
• Scalable and Economical

Future Directions

• Task and data-oriented crowdsourcing
  – heterogeneous big data
  – complex tasks
  – multi-threaded processes
• Human-centric, context-aware crowdsourcing
• Human-machine hybrid computing / learning

Thank you!

Xiaojuan Ma
[email protected]
http://www.cs.princeton.edu/~xm
