
CrowdTruth for User-Centric Relevance



Presented at Mini-workshop on Multiple Dimensions of Relevance: https://www.facebook.com/events/293331494186432/


Page 1: CrowdTruth for User-Centric Relevance

http://lora-aroyo.org http://slideshare.net/laroyo @laroyo

CrowdTruth: Human-assisted computing for understanding semantic interpretation & user-centric relevance

Lora Aroyo

Page 2: CrowdTruth for User-Centric Relevance

“The Gallery of Cornelis van der Geest” (Willem van Haecht)


Page 3: CrowdTruth for User-Centric Relevance

hunting vs. dogs


Page 4: CrowdTruth for User-Centric Relevance

events vs. people


Page 5: CrowdTruth for User-Centric Relevance

DIGITAL HERMENEUTICS

theory of interpretation: relating parts to wholes; events as context for the interpretation of online collections

intersection of hermeneutics & Web technology


Page 6: CrowdTruth for User-Centric Relevance

Linking to Events


Page 7: CrowdTruth for User-Centric Relevance

Generating Events

Narrative


Page 8: CrowdTruth for User-Centric Relevance

So far so good, but ...


L. Aroyo, C. Welty: Truth is a Lie: 7 Myths about Human Annotation. AI Magazine, 2014 (in press).

Page 9: CrowdTruth for User-Centric Relevance

Events are Vague: people have no clear notion of what events are


Page 10: CrowdTruth for User-Centric Relevance


Events have Perspectives, and people don’t always agree

Page 11: CrowdTruth for User-Centric Relevance

“A planned public or social get together or occasion.”  

“an event is an incident that's very important or monumental”  

“An event is something occurring at a specific time and/or date to celebrate or recognize a particular occurrence.”  

“a location where something like a function is held. you could tell if something is an event if there people gathering for a purpose.”  

“Event can refer to many things such as: An observable occurrence, phenomenon or an extraordinary occurrence.”  

If you ask the crowd ...

Page 12: CrowdTruth for User-Centric Relevance

“an event is the exemplification of a property by a substance at a given time” (Jaegwon Kim, 1966)

“events are changes that physical objects undergo” (Lawrence Lombard, 1981)

“events are properties of spatiotemporal regions” (David Lewis, 1986)


If you ask the experts ...

Page 13: CrowdTruth for User-Centric Relevance


People are the ones who search & determine relevance

Page 14: CrowdTruth for User-Centric Relevance

Experts vs. Crowd?

Medical Relation Extraction Task (Aroyo, Welty 2014)
•  91% of expert annotations are covered by the crowd
•  expert annotators agree with each other on only 30% of cases
•  the popular crowd vote covers 95% of expert agreement

Waisda? Video Tagging (Gligorov et al. 2011)
•  only 14% of tags in search logs appear in the professional vocabulary (GTAA)
•  huge gap between expert and lay users’ views on what’s important

Steve.Museum Project (Leason 2009)
•  only 14% of user tags appear in expert-curated documentation


Page 15: CrowdTruth for User-Centric Relevance

CrowdTruth is based on annotator disagreement as an indication of the variation in human semantic interpretation of signs; disagreement can indicate ambiguity, vagueness, over-generality, etc.


crowdtruth.org

L. Aroyo, C. Welty: Crowd Truth: Harnessing disagreement in crowdsourcing a relation extraction gold standard. ACM WebSci 2013.

Page 16: CrowdTruth for User-Centric Relevance

CrowdTruth Framework for News Event Extraction


O. Inel, K. Khamkham, T. Cristea, A. Rutjes, J. van der Ploeg, L. Aroyo, R. Sips, A. Dumitrache, L. Romaszko: CrowdTruth: Machine-Human Computation Framework for Harnessing Disagreement in Gathering Annotated Data. ISWC 2014.

Page 17: CrowdTruth for User-Centric Relevance

The police came to Apple’s glass cube on Fifth Avenue on Tuesday to enforce order after activists released black balloons inside the cube to [protest] the company’s environmental policies.

The police came to Apple’s glass cube on Fifth Avenue on Tuesday [to enforce] order after activists released black balloons inside the cube to protest the company’s environmental policies.

The police came to Apple’s glass cube on Fifth Avenue on Tuesday [to enforce order] after activists released black balloons inside the cube to protest the company’s environmental policies.

The police came to Apple’s glass cube on Fifth Avenue on Tuesday to enforce order after activists [released] black [balloons] inside the cube to protest the company’s environmental policies.

The police [came] to Apple’s glass cube on Fifth Avenue on Tuesday [to enforce] order after activists released black balloons inside the cube to protest the company’s environmental policies.


News Event Extraction

Page 18: CrowdTruth for User-Centric Relevance


Video Event Extraction

Following the grandeur of Baroque, Rococo art is often dismissed as frivolous and unserious, but Waldemar Januszczak disagrees. […] The first episode is about travel in the 18th century and how it impacted greatly on some of the finest art ever made. The world was getting smaller and took on new influences shown in the glorious Bavarian pilgrimage architecture, Canaletto's romantic Venice and the blossoming of exotic designs and tastes all over Europe.

Rococo: Travel, pleasure, madness

Page 19: CrowdTruth for User-Centric Relevance

Events have multiple DIMENSIONS [micro-task template]


Page 20: CrowdTruth for User-Centric Relevance

Each DIMENSION has different GRANULARITY [micro-task template]


Page 21: CrowdTruth for User-Centric Relevance

People have different POINTS OF VIEW [micro-task template]


Page 22: CrowdTruth for User-Centric Relevance

Triangle of Reference

[triangle: Sign, Reference, Observer]


Page 23: CrowdTruth for User-Centric Relevance

Triangle of Reference to Capture Disagreement

[triangle: Sentence, Annotation Task, Worker]
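To make the instantiation concrete, a minimal sketch of one crowd judgment as a triangle (the field names are mine, purely illustrative; they are not the framework's API):

```python
from dataclasses import dataclass

# Abstract triangle vertex -> CrowdTruth instantiation:
#   sign      -> the sentence shown to the crowd
#   reference -> the meaning assigned in the annotation task
#   observer  -> the worker doing the interpreting
@dataclass
class Judgment:
    sign: str        # the sentence
    reference: str   # the annotation, e.g. an event type
    observer: str    # the worker id

j = Judgment(
    sign="... demonstrators [ENTERED] the cube ...",
    reference="ARRIVING_OR_DEPARTING",
    observer="worker_42",
)
```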


Page 24: CrowdTruth for User-Centric Relevance

CrowdTruth Metrics for Event Extraction

Three parts to understanding human interpretations:
•  Sentence: How good is a sentence for the event extraction task?
•  Workers: How well does a worker understand the sentence?
•  Relations: Is the meaning of the event type clear? How ambiguous/confusable is it?


L. Aroyo, C. Welty: The Three Sides of CrowdTruth. Journal of Human Computation, 2014.
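A minimal sketch of how these three views can be computed, assuming each worker's judgment on a sentence is a binary vector over the offered event types (the variable names are mine; the reference implementation lives at crowdtruth.org):

```python
import numpy as np

# Each row: one worker's judgment on the same sentence, as a binary
# vector over the closed set of event types offered in the micro-task.
worker_vectors = np.array([
    [1, 0, 0, 0],   # worker 1 chose ACTION
    [0, 0, 1, 0],   # worker 2 chose ARRIVING_OR_DEPARTING
    [0, 0, 1, 0],   # worker 3 chose ARRIVING_OR_DEPARTING
    [0, 0, 0, 1],   # worker 4 chose PURPOSE
])

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)

# Sentence vector: element-wise sum of all worker vectors.
sentence_vector = worker_vectors.sum(axis=0)

# Sentence-relation score: cosine between the sentence vector and the
# unit vector of one event type -- how strongly the crowd expresses
# that type in this sentence.
n_types = worker_vectors.shape[1]
sentence_relation = [cosine(sentence_vector, np.eye(n_types)[t])
                     for t in range(n_types)]

# Sentence clarity: the best sentence-relation score; a low maximum
# flags an ambiguous or vague sentence.
sentence_clarity = max(sentence_relation)

# Worker-sentence disagreement: 1 - cosine between a worker's vector
# and the sentence vector built from the *other* workers.
def worker_sentence_disagreement(i):
    rest = sentence_vector - worker_vectors[i]
    return 1 - cosine(worker_vectors[i], rest)
```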

Page 25: CrowdTruth for User-Centric Relevance

CrowdTruth Metrics Based on the Triangle of Reference

Three parts to understanding human interpretations:
•  Sign: How good is a sign for conveying information?
•  People: How well does a person understand the sign?
•  Ontology: Are the distinctions of the ontology clear? How ambiguous/confusable are they?


L. Aroyo, C. Welty: The Three Sides of CrowdTruth. Journal of Human Computation, 2014.

Page 26: CrowdTruth for User-Centric Relevance

Disagreement Analytics
•  sentence metrics: sentence clarity, sentence-relation score
•  annotation task metrics: event clarity, type similarity, relation ambiguity
•  worker metrics:
   o  worker-sentence disagreement
   o  worker-worker disagreement
   o  avg number of annotations per sentence
   o  valid words in explanation text
   o  same explanation across contributions
   o  “[OTHER]” + different type
   o  time to complete, number of sentences, etc.

L. Aroyo, C. Welty: Measuring Crowd Truth for Medical Relation Extraction. AAAI Fall Symposium on Semantics for Big Data, 2013.
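As one illustration of the worker metrics listed above, here is a sketch of worker-worker disagreement as one minus the average pairwise cosine over shared sentences (my own formulation; the published metric may weight pairs differently):

```python
import numpy as np
from itertools import combinations

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)

def worker_worker_disagreement(annotations):
    """annotations: dict sentence_id -> {worker_id: binary type vector}.
    Returns per-worker disagreement: 1 - average pairwise cosine with
    every co-worker on the sentences they share."""
    agree = {}
    for votes in annotations.values():
        for w1, w2 in combinations(votes, 2):
            s = cosine(np.asarray(votes[w1]), np.asarray(votes[w2]))
            agree.setdefault(w1, []).append(s)
            agree.setdefault(w2, []).append(s)
    # A worker whose score stays high across many sentences is a
    # candidate spammer (or a systematically different perspective).
    return {w: 1 - float(np.mean(s)) for w, s in agree.items()}
```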


Page 27: CrowdTruth for User-Centric Relevance

Spam Detection

o  filter sentences on their clarity score, to avoid penalizing workers for contributing on ambiguous sentences; removing bad sentences increases the accuracy of spam detection

o  apply worker metrics to find workers who systematically disagree
   o  with the majority (worker-sentence disagreement)
   o  with the rest of their co-workers (worker-worker disagreement)
   removing spammers’ annotations improves the accuracy of the sentence metrics (a sketch of this two-stage filter follows below)
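A sketch of the two-stage filter just described; sentence clarity and worker-sentence disagreement follow the vector sketch on the metrics slide, and the thresholds are illustrative values of my own, not the published ones:

```python
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)

def find_spammers(annotations, clarity_min=0.5, disagreement_max=0.8):
    """annotations: dict sentence_id -> {worker_id: binary type vector}."""
    # Stage 1: keep only sentences clear enough that disagreeing on
    # them says something about the worker, not the sentence.
    clear = {}
    for s, votes in annotations.items():
        sv = np.sum([np.asarray(v) for v in votes.values()], axis=0)
        clarity = max(cosine(sv, np.eye(len(sv))[t]) for t in range(len(sv)))
        if clarity >= clarity_min:
            clear[s] = votes

    # Stage 2: flag workers who systematically disagree with the rest
    # of the crowd on the clear sentences.
    scores = {}
    for votes in clear.values():
        total = np.sum([np.asarray(v) for v in votes.values()], axis=0)
        for w, vec in votes.items():
            vec = np.asarray(vec)
            scores.setdefault(w, []).append(1 - cosine(vec, total - vec))
    return {w for w, d in scores.items() if np.mean(d) > disagreement_max}
```

Because removing spammers changes the sentence vectors, the two stages can be iterated until the flagged set stabilizes.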


Page 28: CrowdTruth for User-Centric Relevance

Annotation Example

Around 2:30 p.m., as if delivering birthday greetings, several Greenpeace demonstrators [ENTERED] the cube clutching helium-filled balloons, which were the shape and color of charcoal briquettes.

Overall annotation & granularity distribution: [chart]


Page 29: CrowdTruth for User-Centric Relevance

Event Type Disagreement

Around 2:30 p.m., as if delivering birthday greetings, several Greenpeace demonstrators [ENTERED] the cube clutching helium-filled balloons, which were the shape and color of charcoal briquettes.

Event types assigned to [ENTERED]: ARRIVING_OR_DEPARTING (54.5%), ACTION (18.2%), PURPOSE (18.2%), MOTION (9.1%)

[triangle: Sentence, Ontology, Worker]

Page 30: CrowdTruth for User-Centric Relevance

Event Location Disagreement

(Same sentence.) Location spans for [ENTERED]: “the cube” (38.5%), “cube” (38.5%), none (23%). Location types across spans: COMMERCIAL (40%), OTHER (40%), INDUSTRIAL (20%), with some spans typed entirely OTHER or NOT APPLICABLE (100%).

[triangle: Sentence, Ontology, Worker]

Page 31: CrowdTruth for User-Centric Relevance

Event Time Disagreement

(Same sentence.) Time spans for [ENTERED]: “Around 2:30 p.m.” (45.45%), “2:30 p.m.” (45.45%), “Around” (9.1%); every span was typed TIMESTAMP (100%).

[triangle: Sentence, Ontology, Worker]

Page 32: CrowdTruth for User-Centric Relevance

Event Participant Disagreement

(Same sentence.) Participant spans for [ENTERED]: “Greenpeace demonstrators” (69.23%), “Greenpeace” (15.39%), “demonstrators” (15.39%). Participant types across spans: PERSON (100%), ORGANIZATION (100%), and a split of ORGANIZATION (77.77%) vs. PERSON (22.22%).

[triangle: Sentence, Ontology, Worker]

Page 33: CrowdTruth for User-Centric Relevance

Comparative Annotation Distribution: Event Type Distribution vs. Time Type Distribution [charts]

The high disagreement on event type across all sentences likely indicates problems with the ontology: these event types are difficult to distinguish between, and the event classes may overlap, be confusable, or be too vague.
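As a rough illustration of this contrast (normalized entropy is a proxy of my own here, not the CrowdTruth clarity metric itself), aggregating the [ENTERED] votes from the preceding slides shows how spread-out the TYPE dimension is compared to the TIME dimension:

```python
import numpy as np

def normalized_entropy(counts):
    """Entropy of a vote distribution, scaled to [0, 1]."""
    p = np.asarray(counts, dtype=float)
    if len(p) < 2 or p.sum() == 0:
        return 0.0
    q = p[p > 0] / p.sum()
    return float(-(q * np.log2(q)).sum() / np.log2(len(p)))

# Aggregated [ENTERED] votes from the slides above (11 workers):
event_type = [2, 1, 6, 2]  # ACTION, MOTION, ARRIVING_OR_DEPARTING, PURPOSE
time_type = [11]           # every time vote went to TIMESTAMP

print(normalized_entropy(event_type))  # ~0.84: high spread, ontology trouble
print(normalized_entropy(time_type))   # 0.0: the TIME types are clear
```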

[triangle: Sentence, Ontology, Worker]


Page 34: CrowdTruth for User-Centric Relevance

Comparative Annotation Distribution: Location Type Distribution vs. Participant Type Distribution [charts]

[triangle: Sentence, Ontology, Worker]


Page 35: CrowdTruth for User-Centric Relevance

Challenges

●  Defining relevance, e.g. relevant or related events, entities, videos

●  Depicted vs. associated relevance, e.g. in video, in audio

●  Dealing with reliability, e.g. provenance

●  Visualizing quality analytics, e.g. multidimensionality


Page 36: CrowdTruth for User-Centric Relevance

Events in Cultural Heritage Exploration

http://dive.beeldengeluid.nl/


Page 37: CrowdTruth for User-Centric Relevance

Events @
•  Agora: Historical Events in Cultural Heritage Collections – http://agora.cs.vu.nl/
•  Extractivism: Activist Events in Newspapers – http://mona-project.org/
•  Semantics of History – http://www2.let.vu.nl/oz/cltl/semhis/
•  BiographyNet: Events Change in Perspective over Time
•  NewsReader: Multilingual Events & Storylines in Newspapers – http://www.newsreader-project.eu/


Page 38: CrowdTruth for User-Centric Relevance

Conclusions

●  Events are just one example of the diversity of human interpretations

●  Understanding crowd disagreement helps understand event semantics

●  Considering the interdependence of the different aspects of the annotations improves their quality

●  Disagreement metrics are adaptable across domains; they helped us understand the vagueness and clarity of a sentence/putative event


[triangle: Sentence, Annotation Task, Worker]

Page 39: CrowdTruth for User-Centric Relevance

Questions?

http://lora-aroyo.org http://slideshare.net/laroyo @laroyo