
Towards Crowd-sourced Semantic-based Multimodal User Interfaces

Xiaojuan Ma, Huawei Noah’s Ark Lab


Tip-of-the-Tongue Phenomenon (ToT)

I am allergic to … Oh no! What is the name of the medicine?


When Syntax-level Paraphrasing Fails …


Semantics-level Paraphrasing Does the Work

Cues: allergy, injection, infection, fungi, extract

[Figure: concept network around "Penicillin" with the related concepts extract, mold, injection, pill, infection, bacteria, fungi, and allergy]


Scalable Semantic-based Multimodal User Interface

[Figure: diagram linking Multimodal Information, Concept, User, and Semantic Lexicon through the interface]

Semantic Lexicon (examples):

• entities: orange, fruit, food, apple, vegetable, abstraction
• attributes: cold, hot, chill, cool, freezing, warm, torrid
• actions: eat, understand, drink, consume, suck, sip

Example: “drink recipe fruit good cold” → “A recipe of fruit juice that is good when drinking while cold”

• Sense Making Effectiveness
• Word Finding Efficiency
• Mimic Mental Lexicon in Human Memory


Understand Mental Lexicon

Mental Lexicon: “words and other verbal symbols, their meaning and referents, about relations among them, and about rules, formulas, and algorithms for manipulating them” (Endel Tulving, 1972)

• Psychology, Linguistics: Availability, Accessibility
• AI, NLP: Disambiguation, Retrieval
• HCI, HRI: Sense-making, Word-finding


Models of Mental Lexicon I

Network Model (Collins and Loftus, 1975)

• Relation-based: Is-a, Has-a, etc.
• Category-based: Is-a relation

[Figure: semantic network with Is-a and Has-a links. Categories: Animal; Fish, Bird, Mammal. Members: Salmon, Shark, Bat, Cow, Penguin. Parts: Head, Fin, Tooth, Wing, Egg, Horn, Coat, Face]
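A relation-based network of this kind can be sketched as a set of labelled edges. The snippet below is only an illustration: the specific Is-a and Has-a links are assumed for the example (the figure's exact edges do not survive in text form), and `SemanticNetwork`, `add`, `ancestors`, and `parts` are hypothetical names.

```python
from collections import defaultdict


class SemanticNetwork:
    """Toy network with labelled edges such as Is-a and Has-a."""

    def __init__(self):
        # edges[relation][source] -> set of target concepts
        self.edges = defaultdict(lambda: defaultdict(set))

    def add(self, source, relation, target):
        self.edges[relation][source].add(target)

    def ancestors(self, concept):
        """All categories reachable by following Is-a links upward."""
        seen, stack = [], [concept]
        while stack:
            node = stack.pop()
            for parent in sorted(self.edges["is-a"][node]):
                if parent not in seen:
                    seen.append(parent)
                    stack.append(parent)
        return seen

    def parts(self, concept):
        """Has-a parts of a concept, including those inherited via Is-a."""
        result = set(self.edges["has-a"][concept])
        for parent in self.ancestors(concept):
            result |= self.edges["has-a"][parent]
        return result


net = SemanticNetwork()
# Assumed links, for illustration only.
for child, parent in [("Fish", "Animal"), ("Bird", "Animal"),
                      ("Salmon", "Fish"), ("Penguin", "Bird")]:
    net.add(child, "is-a", parent)
net.add("Animal", "has-a", "Head")
net.add("Fish", "has-a", "Fin")
net.add("Bird", "has-a", "Wing")

print(net.ancestors("Salmon"))       # categories above Salmon
print(sorted(net.parts("Penguin")))  # own and inherited parts
```

Storing Has-a parts at the most general category and inheriting them along Is-a links keeps the network compact, which is the usual motivation for a category-based hierarchy.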


Models of Mental Lexicon II

Feature Model (Smith, Shoben, and Rips, 1974)

[Figure: feature comparison among Robin, Bird, and Ostrich. Features shown: Is small, Hops, Can Fly, Orange Breast, Has wings, Has feathers, Is large, Walks/Runs, Long legs and neck. Legend: Weak Connection, Strong Connection; Defining Features, Characteristic Features]
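One way to read the feature model is as a weighted overlap between feature sets, where defining features count more than characteristic ones. The sketch below makes that concrete under loud assumptions: the assignment of features to Robin, Bird, and Ostrich, the choice of which features are defining, and the weight of 2.0 are all invented for illustration and are not taken from the slide.

```python
# Assumed feature sets and defining features, for illustration only.
DEFINING = {"has wings", "has feathers"}
FEATURES = {
    "Bird": {"has wings", "has feathers", "can fly"},
    "Robin": {"has wings", "has feathers", "can fly",
              "is small", "hops", "orange breast"},
    "Ostrich": {"has wings", "has feathers",
                "is large", "walks/runs", "long legs and neck"},
}


def overlap(a, b, defining_weight=2.0):
    """Weighted Jaccard overlap; defining features count more (assumed weight)."""
    def mass(features):
        return sum(defining_weight if f in DEFINING else 1.0 for f in features)

    fa, fb = FEATURES[a], FEATURES[b]
    return mass(fa & fb) / mass(fa | fb)


print(overlap("Robin", "Bird"))
print(overlap("Ostrich", "Bird"))
```

With these assumed sets, Robin overlaps with Bird more than Ostrich does, which mirrors the strong versus weak connection in the figure legend.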


Models of Mental Lexicon III

Associative Model (Raaijmakers and Shiffrin, 1981)

[Figure: associative network over the nodes Black, Fish, Bird, Salmon, Robin, Penguin, Fly, Pink, Red, Wing, Feather, Orange, Swim]


Google Knowledge Graph


Microsoft MSRA Probase

(http://research.microsoft.com/en-us/projects/probase/)

[Figure: concept graph with the concepts Markets, European Markets, Emerging Markets, Developing Countries, and Newly Industrialized Countries, and the instances China and India (sim. = 0.84)]

• China: Area = 9,596,961 sq km; Population = 1.3 billion; GDP = $8.7 trillion
• India: Area = 3,287,263 sq km; Population = 1.1 billion; GDP = $3.57 trillion


Princeton WordNet & Stanford ImageNet

(http://wordnet.princeton.edu/) (http://www.image-net.org/)


MIT ConceptNet

(http://conceptnet5.media.mit.edu/)


Semantic-based UI: WordNet + Evocation

[Figure: the Is-a / Has-a network (Animal; Fish, Bird, Mammal; Salmon, Shark, Bat, Cow, Penguin; Head, Fin, Tooth, Wing, Egg, Horn, Coat, Face) augmented with Evocation links to Black, Fly, Pink, Meat, Brown, Swim, and Dangerous]

WordNet® is a large lexical database of English in which concepts are interlinked by means of conceptual-semantic and lexical relations. (http://wordnet.princeton.edu/)

Evocation is a bi-directional, weighted, across-parts-of-speech semantic association / relatedness measure of how much one concept brings to mind another. (Boyd-Graber et al., 2006; Nikolova et al., 2009; Ma et al., 2013)


Why WordNet?


Why Evocation: Spreading Activation Theory

Prepared by Evocation

Faster word finding

Easier sense making

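$$a_y = \sum_{x} f_{xy}\, a_x + c_y \quad \text{(activation)}$$

$$f_{xj} = \frac{s_j}{\sum_{k} s_k}, \quad \text{if node } x \text{ connects to node } j$$

$$s = \sum_{i} t_i^{-b} \quad \text{(strength)}$$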

(Collins and Loftus, 1975)
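Spreading activation over a weighted network can be sketched as a short iterative procedure. The version below is a simplified illustration, not the model used in the talk: it assumes each node passes on a `decay` fraction of its activation, split among its neighbors in proportion to link strength, while source nodes are re-activated at every step. The function name, the `decay` and `steps` parameters, and the link strengths are all hypothetical.

```python
def spread_activation(links, sources, decay=0.5, steps=3):
    """Propagate activation over weighted directed links.

    links:   {node: {neighbor: strength}} with positive strengths
    sources: {node: activation injected at every step}
    decay and steps are assumed parameters, chosen only for illustration.
    """
    nodes = set(links) | {n for nbrs in links.values() for n in nbrs} | set(sources)
    activation = {n: sources.get(n, 0.0) for n in nodes}
    for _ in range(steps):
        incoming = {n: 0.0 for n in nodes}
        for x, nbrs in links.items():
            total = sum(nbrs.values())
            for y, strength in nbrs.items():
                # Each neighbor receives a share proportional to link strength.
                incoming[y] += decay * activation[x] * strength / total
        activation = {n: sources.get(n, 0.0) + incoming[n] for n in nodes}
    return activation


# Hypothetical evocation strengths around "doctor".
links = {
    "doctor": {"nurse": 0.8, "hospital": 0.6, "sick": 0.5},
    "hospital": {"nurse": 0.5, "sick": 0.5},
}
result = spread_activation(links, {"doctor": 1.0})
ranked = sorted(result, key=result.get, reverse=True)
print(ranked)
```

Concepts that end up with higher activation are the ones an interface could surface first, which is the intuition behind faster word finding.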


Outline

• Introduction
  – Goal
    • Effective and efficient semantic user interfaces
    • Sense making and word finding at scale
  – Proposed Approach
    • Semantic network augmented with associative links
    • Theoretical foundation: Spreading Activation Theory
• Methodology
• Evaluation
• Conclusion and Future Directions


Outline

• Introduction
• Methodology
  – Crowdsourcing enhanced with ML and NLP
  – Method 1: open response-based crowdsourcing
  – Method 2: rating-based crowdsourcing
• Evaluation
• Conclusion and Future Directions


Crowdsourcing


Method I: Open Response-based Crowdsourcing

Free Association Norms → Evocation

[Figure: the stimulus word "Doctor" surrounded by the first response words that came to mind: Nurse, PhD, Hosp, Ill, Flu, Clinic, ER, App, Health, Medical, Pill, Pain, Care, Serious. Counts shown on the slide: 6000+, (5000+), (75,000+)]

Free association norms: http://w3.usf.edu/FreeAssociation/Intro.html

WordNet senses linked by evocation (edge weights shown: 0.8, 0.6, 0.5, 0.2):

• Doctor (n.): A licensed medical practitioner
• Nurse (n.): One skilled in caring for the sick
• Hospital (n.): A health facility where patients receive treatment
• Sick (adj.): Affected by an impairment of normal physical or mental function
• Diagnose (v.): Determine or distinguish the nature of an illness

(Ma, Language Resources and Evaluation, 2013)


Method I: Open Response-based Crowdsourcing

Free Association Norms → Evocation

• Step 1: Cluster response words
• Step 2: Disambiguate word senses
• Step 3: Assign evocation strength

[Figure: the response words for "Doctor" (Nurse, Sick, Hosp, Ill, Flu, Clinic, ER, App, Health, Medical, Pill, Pain, Care, Serious) pass through the three steps and end up as links among the WordNet senses Doctor (n.), Nurse (n.), Hospital (n.), Sick (adj.), and Diagnose (v.), with edge weights 0.8, 0.6, 0.5, 0.2. Count shown on the slide: (75,000+)]


Step 1: Cluster Response Words


Step 2: Disambiguate Word Senses

Algorithms of Semantic Association Measures

• WordNet Path-based: “path”, “wup”, “lch”
• WordNet Gloss-based: “lesk”, “vector”, “vector_pairs”
• Corpora-based: “res”, “lin”, “jcn”

Voting-based Process


Step 2: Disambiguate Word Pairs

Simple Voting

$$\mathrm{vote}_k(s_{x,w_j}) = \begin{cases} 1, & \text{if } \mathrm{score}_k(s_{x,w_j}) = \max_{i=1,\dots,N} \mathrm{score}_k(s_{i,w_j}) \\ 0, & \text{else} \end{cases}$$

$$\mathrm{voteCount}(s_{i,w_j}) = \sum_{k=1}^{K} \mathrm{vote}_k(s_{i,w_j})$$

$$s_{w_j} = s_{x,w_j}, \quad \text{if } \mathrm{voteCount}(s_{x,w_j}) = \max_{s_{i,w_j} \in \mathrm{candidates}(s_{w_j})} \mathrm{voteCount}(s_{i,w_j})$$

Weighted Voting

$$\mathrm{weight}_k(s_{x,w_j}) = \frac{\mathrm{score}_k(s_{x,w_j})}{\max_{i=1,\dots,N} \mathrm{score}_k(s_{i,w_j})}$$

$$\mathrm{weightedVote}(s_{i,w_j}) = \sum_{k=1}^{K} \mathrm{weight}_k(s_{i,w_j})$$

Voting among candidates:

$$s_{w_j} = s_{x,w_j}, \quad \text{if } \mathrm{weightedVote}(s_{x,w_j}) = \max_{s_{i,w_j} \in \mathrm{candidates}(s_{w_j})} \mathrm{weightedVote}(s_{i,w_j})$$

Voting among all senses:

$$s_{w_j} = s_{x,w_j}, \quad \text{if } \mathrm{weightedVote}(s_{x,w_j}) = \max_{i=1,\dots,N} \mathrm{weightedVote}(s_{i,w_j})$$
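The two voting schemes can be sketched in a few lines. This is a minimal illustration, assuming that each of several relatedness measures has already produced one score per candidate sense of a word; `simple_vote`, `weighted_vote`, the tie-breaking rule, and the score values are hypothetical and not taken from the talk.

```python
def simple_vote(scores):
    """scores[k][i]: score that measure k gives candidate sense i.

    Each measure votes for its top-scoring sense(s); the sense with the
    most votes wins. Ties go to the lowest sense index (an assumption).
    """
    counts = [0] * len(scores[0])
    for row in scores:
        best = max(row)
        for i, s in enumerate(row):
            if s == best:
                counts[i] += 1
    return counts.index(max(counts)), counts


def weighted_vote(scores):
    """Each measure contributes score / (its own max score) to every sense."""
    totals = [0.0] * len(scores[0])
    for row in scores:
        best = max(row)
        if best <= 0:
            continue  # this measure gives no usable evidence
        for i, s in enumerate(row):
            totals[i] += s / best
    return totals.index(max(totals)), totals


# Made-up scores from three measures for three senses of one word.
scores = [
    [1.0, 0.9, 0.1],
    [1.0, 0.9, 0.2],
    [0.1, 1.0, 0.0],
]
print(simple_vote(scores))
print(weighted_vote(scores))
```

On these made-up scores the two schemes disagree: simple voting picks the first sense because two measures rank it top, while weighted voting picks the second sense because it scores near the top under every measure.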


Step 3: Assign Evocation Strength

forward strength = % of agreement (Avg. = 5.73%, SD. = 9.37%)

Strength levels: very strong (immediate), strong, moderate, weak
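Forward strength as a percentage of agreement can be computed directly from raw responses. The sketch below assumes one response per participant for a single stimulus word and normalizes only case and surrounding whitespace; the function name and the sample responses are hypothetical.

```python
from collections import Counter


def forward_strength(responses):
    """Share of participants who gave each response to one stimulus word."""
    counts = Counter(r.strip().lower() for r in responses)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


# Hypothetical responses to the stimulus "doctor".
responses = ["nurse", "Nurse", "hospital", "sick",
             "nurse", "pill", "hospital", "nurse"]
strengths = forward_strength(responses)
for word, share in sorted(strengths.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {share:.1%}")
```

Cut-offs that map these percentages to the levels very strong, strong, moderate, and weak are not given on the slide, so none are assumed here.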


Method I: Open Response-based Crowdsourcing

From Free Association Norms to Evocation


Method II: Rating-based Crowdsourcing

(Nikolova et al., ASSETS2009)

2990 Amazon Mechanical Turkers

10,000 pairs of concepts

$0.07 per 50 pairs

10 days


Improved Crowdsourcing Evocation Rating

41,604 pairs of concepts with method (a)

60,000 pairs of concepts with method (b)


Comparing Two Crowd-sourced Evocation Datasets


Extension of Evocation via Boosting Algorithm

(BoosTexter, Schapire and Singer, 2000)

WordNet-based Features:

• “path”: shortest path
• “jcn”: Jiang & Conrath
• “lesk”: Banerjee & Pedersen
• “hso”: Hirst & St. Onge
• “lch”: Leacock & Chodorow
• “pos”: Part of Speech

Corpora/Context-based Features:

• Relative Entropy
• Mean
• Variance
• L1 Distance
• L2 Distance
• Correlation
• Contextual Overlap
• LSA-vectors Cosine
• Frequency
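Several of the corpora/context-based features are standard comparisons between two context-count vectors. The sketch below shows how four of them could be computed for one word pair; it is not the feature extraction used in the talk. The function name, the add-one smoothing in the relative entropy, and the example vectors are assumptions.

```python
import math


def context_features(p, q):
    """Compare two context-count vectors of equal length.

    Returns L1 distance, L2 distance, cosine, and relative entropy
    (KL divergence of the normalized vectors, with add-one smoothing,
    an assumption made here to avoid division by zero).
    """
    l1 = sum(abs(a - b) for a, b in zip(p, q))
    l2 = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    cosine = dot / norm if norm else 0.0
    ps = [a + 1 for a in p]
    qs = [b + 1 for b in q]
    sp, sq = sum(ps), sum(qs)
    kl = sum((a / sp) * math.log((a / sp) / (b / sq)) for a, b in zip(ps, qs))
    return {"l1": l1, "l2": l2, "cosine": cosine, "relative_entropy": kl}


# Hypothetical co-occurrence counts of two words over three contexts.
print(context_features([3, 0, 1], [2, 1, 1]))
```

A feature dictionary like this, computed per word pair, is the kind of input a boosting classifier could be trained on alongside the WordNet-based measures.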


Outline

• Introduction
• Methodology
• Evaluation
  – Evaluation 1: sense making effectiveness
  – Evaluation 2: word finding efficiency
  – Extension with other crowd-sourced human data
• Conclusion and Future Directions


Evaluation I: Sense Making Effectiveness

(Ma et al., MM2009)


Evaluation I: Crowd-sourced Image/Sound Datasets

• Peekaboom Game Dataset (Von Ahn et al., 2006): 3,086 images; about 18,500 labels
• SoundNet Dataset (Ma et al., 2010): 327 environmental sound clips; about 8,000 labels; 3000 Amazon Mechanical Turkers, 100 people / sound


Evaluation I: Word Sense Disambiguation

Dictionary Definition:

1. Cow -- (a fully grown female animal of a domesticated breed of ox, kept to produce milk or beef)

2. Cow -- (a large unpleasant woman)

[Figure: label sets (Sky, Grass, Cow, Green; Cow, Moo) go through word sense disambiguation (+WSD). Image Similarity, Audio Similarity, and Label Similarity feed Machine Learning. Example captions: Cow mooing; Cows on the grass; Milking a cow; Cattle returning in dusk]


Evocation Differs from Existing Semantic Relatedness Measures


Evaluation II: Word Finding Efficiency

(Nikolova, Ma, Tremaine & Cook, IUI2010)


Evaluation II: Task and Interfaces

Crowd-sourced evocation

Is-a in WordNet


Evaluation II: Participants – People with Aphasia

20 stroke survivors with language impairment


Evaluation II: Word Finding Efficiency

Task Completion Time (min.)


Outline

• Introduction
• Methodology
• Evaluation
  – Evaluation 1: sense making effectiveness
  – Evaluation 2: word finding efficiency
  – Extension with other crowd-sourced human data
    • Attention (online eye-tracking)
    • Emotion (food messaging)
    • Intention (trip planning)
• Conclusion and Future Directions


(Cheng, Sun, Ma, Forlizzi, Hudson & Dey, CSCW2015)

Attention: Social Eye Tracking


Emotion: Food Messaging

(Wei, Ma & Zhao, CHI2014)


Intention: Trip Planning

(Chen, Zhang, Guo, Ma, et al., IEEE Transactions on Intelligent Transportation Systems)


Outline

• Introduction
• Methodology
• Evaluation
• Conclusion and Future Directions


Crowd-sourced Semantic-based Multimodal User Interface

[Figure: the Is-a / Has-a network (Animal; Fish, Bird, Mammal; Salmon, Shark, Bat, Cow, Penguin; Head, Fin, Tooth, Wing, Egg, Horn, Coat, Face) augmented with Evocation links to Black, Fly, Pink, Meat, Brown, Swim, and Dangerous, shown next to the interface diagram linking Multimodal Information, Concept, User, and Semantic Lexicon. Example: “drink recipe fruit good cold” → “A recipe of fruit juice that is good when drinking while cold”]

• Sense Making Effective
• Word Finding Efficient
• Scalable and Economical


Future Directions

• Task and data-oriented crowdsourcing
  – heterogeneous big data
  – complex tasks
  – multi-threads processes

• Human-centric, context-aware crowdsourcing

• Human-machine hybrid computing / learning


Thank you!

Xiaojuan Ma

http://www.cs.princeton.edu/~xm