Dial E for Events Lora Aroyo Monday, September 10, 12

I-Semantics 2012 Keynote


This talk was given as a keynote at the I-Semantics 2012 conference, Graz, Austria


Page 1: I-Semantics 2012 Keynote

Dial E for Events

Lora Aroyo

Monday, September 10, 12

Page 2: I-Semantics 2012 Keynote

Observation

events are important

events are omni-present

in the world, e.g. news, science etc.

in our personal lives, e.g. social networking

events carry different points of view

iSemantics2012 Lora Aroyo @laroyo Flickr: elkabong Monday, September 10, 12

Page 3: I-Semantics 2012 Keynote

Position

The human disagreement & vagueness of events are part of the event semantics

Page 4: I-Semantics 2012 Keynote

Objects vs. Events

events perdure = their parts exist at different time points

objects endure = they have all their parts at all points in time

objects are wholly present at any point in time, events unfold over time

Page 5: I-Semantics 2012 Keynote

Events are important

create context for objects, e.g. people, locations, organizations, etc.

Page 6: I-Semantics 2012 Keynote

Events are important

create meaning for objects, e.g. artifacts, pictures, videos.

Page 7: I-Semantics 2012 Keynote

Events are important

link concepts, objects, and stories.

Page 8: I-Semantics 2012 Keynote

Events in the World

events anchor the information we consume daily

Page 9: I-Semantics 2012 Keynote

Events @ Google News


Page 11: I-Semantics 2012 Keynote

Julian Assange’s Extradition Row


Page 15: I-Semantics 2012 Keynote

Lance Armstrong’s Doping Fight


Page 17: I-Semantics 2012 Keynote

The Arab Spring


Page 19: I-Semantics 2012 Keynote

Events @ Social Web


Page 24: I-Semantics 2012 Keynote

Events are Vague

Humans have no clear notion of what events are

Page 25: I-Semantics 2012 Keynote

“event is a significant "happening" or gathering of people. I would define a "happening" as an event if the group of people gathered were united in one common goal.”

We Asked the Crowd What an EVENT is

Page 26: I-Semantics 2012 Keynote

We Asked the Crowd What an EVENT is

Event is a happening, which can be scheduled or unscheduled. An earthquake or fire happens (unscheduled). A wedding or birthday party (scheduled). It is an occasion that is unusual and tends to be memorable.


Page 27: I-Semantics 2012 Keynote

“An event would be any occurrence where physical action has taken place. It may be a single, momentary instance (I sneezed), or it may span a period of time (the festival ran for four hours). An event may also be made up of a number of smaller events, such as a day at school is an event, but each individual class is also an event itself. Basically an event must have a physical action over any delimited time span.”

We Asked the Crowd What an EVENT is

Page 28: I-Semantics 2012 Keynote

“A planned public or social get together or occasion.”

“an event is an incident that's very important or monumental”

“An event is something occurring at a specific time and/or date to celebrate or recognize a particular occurrence.”

“a location where something like a function is held. you could tell if something is an event if there people gathering for a purpose.”

“Event can refer to many things such as: An observable occurrence, phenomenon or an extraordinary occurrence.”

We Asked the Crowd What an EVENT is

Page 29: I-Semantics 2012 Keynote

What do Experts think an EVENT is?

“an event is the exemplification of a property by a substance at a given time” Jaegwon Kim, 1966

“events are changes that physical objects undergo” Lawrence Lombard, 1981

“events are properties of spatiotemporal regions” David Lewis, 1986

Page 30: I-Semantics 2012 Keynote

Event-centric Projects

how events can be detected & extracted from natural language text

how those extracted events are represented for use on the semantic web

how to identify the same events in different sources

how to capture different perspectives

Page 31: I-Semantics 2012 Keynote

Activists

prominent on the new web through different channels

by nature multi-perspective, biased & emotional


Page 33: I-Semantics 2012 Keynote

Mapping Online Networks of Activism

“All protest events Greenpeace participated in.”

blogs, news, activists websites


Page 34: I-Semantics 2012 Keynote

Mapping Online Networks of Activism

“All protest events Greenpeace participated in.”

• build visualizations

• create appropriate analytics

• answer questions of end users & social scientists

blogs, news, activists websites


Page 35: I-Semantics 2012 Keynote

Extracting Historical Events

• What happened before/after?

• Who does what, when, and where?

• All bomb attacks in the 1950s

• In what events did Indonesia participate?

• ‘Grand narratives’

• objects (digitized artworks and artifacts)

• events (concrete particulars)

• entities (actors, locations, periods)

• narratives (organization of events)

Page 36: I-Semantics 2012 Keynote


Page 37: I-Semantics 2012 Keynote

• generate meaningful event sequences

• capture the different perspectives

• serve both end users & history researchers

Page 38: I-Semantics 2012 Keynote

Timelines from Text

4 right, 2 wrong, 3 missing events

two have no explicit times & are in the wrong order

One involved al-Qaeda but took place in Jordan on the Syrian border

does a fuzzy task require fuzzier metrics?

“al-Qaeda activities in Syria”
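The counts on this slide can be turned into conventional scores. A minimal sketch in Python, assuming "right" means a correctly extracted event, "wrong" a spurious one, and "missing" an event the system did not find. This mapping onto precision and recall is an assumption, not something the slide states, and the function name is hypothetical:

```python
from typing import Tuple


def precision_recall_f1(right: int, wrong: int, missing: int) -> Tuple[float, float, float]:
    """Strict scores: each extracted event is either right or wrong, no partial credit."""
    precision = right / (right + wrong)    # share of extracted events that are right
    recall = right / (right + missing)     # share of expected events that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts taken from the slide: 4 right, 2 wrong, 3 missing.
p, r, f1 = precision_recall_f1(right=4, wrong=2, missing=3)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
# prints "precision=0.67 recall=0.57 f1=0.62"
```

Strict scores like these give no partial credit, for example for an event that is relevant but out of order or just outside the queried region.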


Page 40: I-Semantics 2012 Keynote

Machine Reading

build event timelines from text in 2 example domains

• NFL: news articles on football

• Ontology: 4 classes, 20 relations

• Intel: news articles on terrorist events

• Ontology: 20 classes, 50 relations

Page 41: I-Semantics 2012 Keynote

Why is event semantics hard?


Page 42: I-Semantics 2012 Keynote

According to NLP tradition

1. Gather your source material

2. Extract events and properties

3. Analyze: find links between events

4. Visualize: statistics, timelines, etc.


Page 44: I-Semantics 2012 Keynote

but for events we stumble

1. Gather your source material

2. Extract events and properties

3. Analyze: find links between events

4. Visualize: statistics, timelines, etc.

experts typically:

define a problem

annotate ground truth

train

evaluate

Page 45: I-Semantics 2012 Keynote

Closed World Dictatorship

1. domain experts define the meaning

2. using limited vocabulary

3. aim for agreement

to fix the problem of high disagreement for events, experts enforce more tyranny - stricter rules

comparatively little annotated data for training & evaluation of event detection systems

Page 46: I-Semantics 2012 Keynote

But the World is Open

1. events have multiple dimensions

2. each dimension has levels of granularity

3. people have different views on both

all this leads to very complex semantics

Page 47: I-Semantics 2012 Keynote

and our goal is ...

1. not to enforce agreement

2. to capture different view points

3. to teach machines to reason in the disagreement space

Page 48: I-Semantics 2012 Keynote

Position

Artificially restricting humans does not help machines to learn.

Machines will learn from diversity

Page 49: I-Semantics 2012 Keynote

Professional Dictatorship of the Closed World

• Museums, libraries, archives & researchers have been dominating the views.

• Controlled vocabularies & annotation schemes were leading.

• Professionals enforced agreement among themselves.

• End-users' needs & tasks are not considered.

Page 50: I-Semantics 2012 Keynote

there is a tiny overlap between end-user terminology & professional annotations

the latter are typically coarse-grained & refer to entire object / topic


Page 51: I-Semantics 2012 Keynote

only 1,900 tags (32,200 in total) match in vocabularies

257 in people (83 validated)

1,661 in geo (666 validated)

9,796 validated, but no match in professional vocabulary

8% professional vocab

23% lexical vocab

63% meaningful Google matches

Page 52: I-Semantics 2012 Keynote

Amateur democracy of the Open World

• Once the Web opened the world of information, professional dictatorship clashed with end-users democracy.

• What professionals consider interesting, relevant or important does not match what users think of it.

• Amateurs cannot find what they were searching for.

Page 53: I-Semantics 2012 Keynote

people are interested in different annotation categories than the professionals


Page 54: I-Semantics 2012 Keynote

Tag sample: 1,343 verified tags of 5 random video fragments

Object tags (1,313)

Scene tags (30)

Video aspects that are described by those tags:

non-visual (0)

perceptual (11), e.g. color

conceptual (1,332)

195 tags (adverbs & adjectives) couldn’t be classified

         Abstract   General   Specific   Total
Who      10         166       177        31%
What     73         563       12         57%
Where    0          68        8          7%
When     4          31        6          5%
Total    7%         74%       9%

Page 55: I-Semantics 2012 Keynote

Harnessing the Crowd

• we include users’ opinions as first-class citizens.

• this brings the need to combine all these (different) opinions into a system of opinions that makes sense.

• new solutions are needed, e.g. crowdsourcing of perspectives on events that exploit disagreement


Page 56: I-Semantics 2012 Keynote

What do People Disagree on?

are sub-events always mere parts?

are “mentions” meaningful for events?

are events coreferential across documents? (e.g. perspectives, observations)

Page 57: I-Semantics 2012 Keynote

the bombing targeted a housing development in Baghdad, killing 3 and injuring 13

indistinguishable by people, confusable: is bombing part of killing, or killing part of bombing?

What about targeting?

“mereologically extensional” (i.e. arbitrary): container bursting into fragments as a result of explosion

some events don’t exist: an action by military forces prevented the bombing.


Page 58: I-Semantics 2012 Keynote

Disagreement Framework

• ontology: disagreements on the basic status of events themselves as referents of linguistic utterances, e.g. are people events or do events exist at all.

• granularity: disagreements that result from issues of granularity, e.g. the location being a country, region, or city, the time being a day, week, month, etc.

• interpretation: disagreements that result from (non-granular) ambiguity, differences in perspective, or error in interpreting an expression, e.g. classifying a person as a terrorist/hero, “October Revolution” took place in September.


Page 60: I-Semantics 2012 Keynote

Granularity Disagreement

spatial, temporal, participants

compositional, classificational

Page 61: I-Semantics 2012 Keynote


Page 62: I-Semantics 2012 Keynote


Page 63: I-Semantics 2012 Keynote

Event Participants Disagreement

Prime minister Benjamin Netanyahu

Benjamin Netanyahu

Israeli Prime minister

Cabinet

Benjamin Netanyahu’s Cabinet

Israeli Cabinet

his Cabinet

Israeli Government

{TOLD}

50%, 35%, 15%, 10%, 15%, 5%, 45%, 15%

Page 64: I-Semantics 2012 Keynote

Temporal Disagreement

Spring 1998

March 1, 1998

March 1998

Sunday

Prime minister Benjamin Netanyahu

Benjamin Netanyahu

Israeli Prime minister

{TOLD}

50%, 35%, 15%, 25%, 15%, 50%, 5%

Page 65: I-Semantics 2012 Keynote

Spatial Disagreement

Lebanon

Israel

Southern Lebanon

Israel's Northern Frontier

Middle East

{WILLING TO WITHDRAW}

35%, 45%, 10%, 30%, 65%
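One way to separate granularity disagreement from other disagreement over spatial labels like these is to check whether one label contains the other. A minimal sketch, assuming a hand-written part-of hierarchy; the PART_OF table and the function names are hypothetical and not from the talk:

```python
from typing import Dict, List, Optional

# Hypothetical part-of hierarchy (child -> parent), written by hand for illustration.
PART_OF: Dict[str, Optional[str]] = {
    "Middle East": None,
    "Lebanon": "Middle East",
    "Israel": "Middle East",
    "Southern Lebanon": "Lebanon",
    "Israel's Northern Frontier": "Israel",
}


def ancestors(place: str) -> List[str]:
    """Return every region that contains `place`, nearest first."""
    chain: List[str] = []
    parent = PART_OF.get(place)
    while parent is not None:
        chain.append(parent)
        parent = PART_OF.get(parent)
    return chain


def disagreement_kind(a: str, b: str) -> str:
    """Classify a pair of spatial annotations for the same event mention."""
    if a == b:
        return "agreement"
    if a in ancestors(b) or b in ancestors(a):
        return "granularity"   # same place, described at a different level
    return "non-granular"      # different places: ambiguity, perspective or error


print(disagreement_kind("Lebanon", "Southern Lebanon"))
print(disagreement_kind("Lebanon", "Israel"))
```

Pairs classified as "granularity" need not be treated as conflicting answers; only the non-granular pairs point at ambiguity or differences in perspective.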


Page 66: I-Semantics 2012 Keynote

Approach Principles

1. tolerate, capture & exploit disagreement

2. understand the range of disagreements by creating a space of possibilities with frequencies & similarities

3. score the machine output based on where it falls in this space

4. adaptable to new annotation tasks
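Principles 2 and 3 can be sketched as follows: the crowd's answers form a frequency distribution, and a machine answer is scored by how much crowd mass lies on or near it. The token-overlap similarity, the function names and the example answers are assumptions made for illustration, not the talk's method or data:

```python
from collections import Counter
from typing import Dict, Iterable


def disagreement_space(annotations: Iterable[str]) -> Dict[str, float]:
    """Relative frequency of each distinct crowd answer."""
    counts = Counter(annotations)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def token_similarity(a: str, b: str) -> float:
    """Jaccard overlap of lowercased tokens (a stand-in for a real similarity measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def score(machine_output: str, space: Dict[str, float]) -> float:
    """Crowd frequency weighted by similarity to the machine output, in [0, 1]."""
    return sum(freq * token_similarity(machine_output, label)
               for label, freq in space.items())


# Hypothetical crowd answers for one event location (not data from the talk).
crowd = ["Lebanon"] * 4 + ["Southern Lebanon"] * 5 + ["Middle East"]
space = disagreement_space(crowd)
for candidate in ("Southern Lebanon", "Lebanon", "Israel"):
    print(candidate, round(score(candidate, space), 2))
```

An answer that few or no annotators gave scores low, while an answer close to a popular one still earns partial credit instead of being marked simply wrong.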


Page 67: I-Semantics 2012 Keynote

it seems to refer to an inference or communicated feeling more than specific event.

does not refer to an event

a group of people did something specific at a specific point in time.

refers to an event

the actors in question (top Israeli officials) performed an action during a specified time (Sunday).

refers to an event

it refers to what the israelis did on sunday, a specific time.

Top Israeli officials SENT strong new SIGNALS Sunday that Israel wants to withdraw from southern Lebanon, ...

Page 68: I-Semantics 2012 Keynote

Because it is describing a historical issue concerning the resolution of 1978

refers to an event

That 1978 resolution calls for Israel's unconditional WITHDRAWAL from the self-declared security zone it occupies in south Lebanon, ...

it is not a particular movement that has or is going on but a request that the country of Israel remove their forces from the zone they occupy.

does not refer to an event

the sentence is speaking of a demand for a withdrawal that had not yet occurred.

does not refer to an event

Page 69: I-Semantics 2012 Keynote

The Dark Side of Crowdsourcing Disagreement

• disagreement is beautiful, except when it results from spamming

• crowdsourcing has to account for people that want to get paid for not doing any work

• spammers generate disagreement for the wrong reasons

• most spam detection requires a gold standard

Page 70: I-Semantics 2012 Keynote

Spam or not?

• cut & paste from text

• identical to other explanations

• much shorter time than the average

• low trust value of the worker

• shorter than 5-6 words
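The signals listed above could be checked mechanically. A minimal sketch, assuming each explanation arrives with the time spent and a worker trust score; the Explanation type, the spam_signals function and all thresholds are hypothetical choices for illustration, not values from the talk:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Explanation:
    text: str
    seconds_spent: float
    worker_trust: float  # assumed to lie in [0, 1]


def spam_signals(e: Explanation, sentence: str, others: List[str],
                 avg_seconds: float, min_words: int = 5,
                 min_trust: float = 0.5, time_ratio: float = 0.5) -> List[str]:
    """Return the names of the spam signals that fire for one explanation."""
    signals: List[str] = []
    normalized = e.text.strip().lower()
    if normalized and normalized in sentence.lower():
        signals.append("cut & paste from text")
    if any(normalized == o.strip().lower() for o in others):
        signals.append("identical to other explanations")
    if e.seconds_spent < time_ratio * avg_seconds:   # threshold is an assumption
        signals.append("much shorter time than the average")
    if e.worker_trust < min_trust:                   # threshold is an assumption
        signals.append("low trust value of the worker")
    if len(e.text.split()) < min_words:
        signals.append("shorter than 5-6 words")
    return signals


sentence = ("Top Israeli officials sent strong new signals Sunday that "
            "Israel WANTS TO WITHDRAW from southern Lebanon")
suspect = Explanation("Israel WANTS TO WITHDRAW", seconds_spent=3.0, worker_trust=0.2)
print(spam_signals(suspect, sentence, others=[], avg_seconds=30.0))
```

A signal on its own is weak evidence, since a short explanation can be perfectly valid, so counting how many signals fire is likely more useful than rejecting on any single one.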


Page 71: I-Semantics 2012 Keynote

Spam or not?

Prime Minister Benjamin Netanyahu TOLD his Cabinet on Sunday that Israel was willing to ...

Because being told something doesn't seem like an event.

does not refer to an event

Because the WAR is being described as a costly event.

refers to an event

Top Israeli officials sent strong new signals Sunday that Israel wants to withdraw from southern Lebanon, where a costly WAR of attrition has been claiming soldiers' lives.

Because Israel WANTS TO WITHDRAW from Lebanon.

refers to an event

Top Israeli officials sent strong new signals Sunday that Israel WANTS TO WITHDRAW from southern Lebanon, ...

+ low worker trust

+ low worker trust

+ low worker trust

Because WANTS TO WITHDRAW is an action.

refers to an event

Top Israeli officials sent strong new signals Sunday that Israel WANTS TO WITHDRAW ...

+ short time

Page 72: I-Semantics 2012 Keynote

Motivation-Verification Method

• 2-stage method:

• disagreement collection + motivation

• spam filtering = motivation judgement

• Additionally:

• sample the motivation stage to manually extract gold standard for stage 2
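The two stages can be sketched as a small filter. This assumes one annotation per worker and that stage 2 keeps an annotation when a strict majority of judges accept its motivation; the majority rule, the Annotation type and the function names are assumptions for illustration, not details given in the talk:

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Annotation:
    worker: str
    label: str        # e.g. "refers to an event"
    motivation: str   # free-text explanation collected in stage 1


def filter_by_motivation(annotations: List[Annotation],
                         judgements: Dict[str, List[bool]]) -> List[Annotation]:
    """Stage 2: keep annotations whose motivation most judges accepted."""
    kept: List[Annotation] = []
    for a in annotations:
        votes = judgements.get(a.worker, [])
        if votes and sum(votes) > len(votes) / 2:
            kept.append(a)
    return kept


def label_distribution(annotations: List[Annotation]) -> Counter:
    """Disagreement that remains after spam filtering."""
    return Counter(a.label for a in annotations)


stage1 = [
    Annotation("w1", "refers to an event", "officials performed an action on Sunday"),
    Annotation("w2", "does not refer to an event", "it is a communicated feeling"),
    Annotation("w3", "refers to an event", "asdf"),
]
stage2 = {"w1": [True, True, False], "w2": [True, True, True], "w3": [False, False, True]}
print(label_distribution(filter_by_motivation(stage1, stage2)))
```

Note that the filter removes only the annotation with the rejected motivation; the genuine disagreement between the two remaining workers is kept.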


Page 73: I-Semantics 2012 Keynote

Extraction of Putative Events

Phase I:

A. Collect event annotations + motivations (input: putative events)

Manual selection of Gold Questions (input: output of A)

C. Filtering spam event annotations (input: output of A)

Phase II:

A. Collect event types + motivations (input: list of events)

Manual selection of Gold Questions (input: output of A)

B. Filtering spam event types (input: output of A)

Phase III:

A. Collect event modalities + motivations (input: list of events)

Manual selection of Gold Questions (input: output of A)

B. Filtering spam event modalities (input: output of A)

Phase IV:

A. Collect event role fillers + motivations (input: list of events)

Manual selection of Gold Questions (input: output of A)

B. Filtering spam event role fillers (input: output of A)

• a new way of measuring ground truth

• a new set of semantic features for learning in event extraction

Page 74: I-Semantics 2012 Keynote

Position

Artificially restricting humans does not help machines to learn.

Machines will learn from diversity

Page 75: I-Semantics 2012 Keynote

Position

The human disagreement & vagueness of events are part of the event semantics

Page 76: I-Semantics 2012 Keynote

finally ...

end the tyranny

disagreement is beautiful

Page 77: I-Semantics 2012 Keynote

Acknowledgements

Michiel Hildebrand

Lotte Belice Baltussen

Jacco van Ossenbruggen

Maarten Brinkerink

Johan Oomen

Guus Schreiber

Riste Gligorov

Marieke van Erp

Lourens van der Meij

Roxane Segers

Piek Vossen

Susan Legêne

Thomas Ploeger

Chiel van den Akker

Frank de Bakker

Bibiana Armenta

Iina Hellsten

Geertje Jacobs

Geert-Jan Houben

Chris Welty


Page 78: I-Semantics 2012 Keynote

Questions?

@laroyo

http://lora-aroyo.org