Exploiting Subjective Annotations Dennis Reidsma and Rieks op den Akker Human Media Interaction University of Twente http://hmi.ewi.utwente.nl


Exploiting Subjective Annotations

Dennis Reidsma and Rieks op den Akker

Human Media Interaction

University of Twente

http://hmi.ewi.utwente.nl


Types of content

Annotation as a task of subjective judgments?

Manifest content

Pattern latent content

Projective latent content

Cf. Potter and Levine-Donnerstein 1999


Projective latent content

Why annotate data as projective latent content?

Because it cannot be defined exhaustively, whereas annotators have good ‘mental schemas’ for it

Because the data should be annotated in terms that fit with the understanding of ‘naïve users’


Inter-annotator agreement and projective content

Disagreements may be caused by:

Errors by annotators

Invalid scheme (no true label exists)

Different annotators having different ‘truths’ in interpretation of behavior (subjectivity)


Subjective annotation

People communicate in different ways, and therefore, as an observer, may also judge the behavior of others differently

Projective content may be especially vulnerable to this problem

How to work with subjectively annotated data?


Subjective annotation

How to work with subjectively annotated data?

Unfortunately, subjective annotation leads to low levels of agreement, and such data would therefore usually be avoided as ‘unproductive material’


I. Predicting agreement

One way to work with subjective data is to try to find out in which contexts annotators would agree, and focus on those situations.

Result: a classifier that does not classify every instance, but is more accurate on the instances it does classify
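
A minimal sketch of this idea, not the authors' implementation: estimate the agreement rate per context from a small doubly annotated sample, then let a classifier answer only in contexts where that rate is high. The function names, the 0.8 threshold, the context strings, and the labels are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def agreement_by_context(double_annotated):
    """Per context, the fraction of items on which two annotators chose the same label."""
    agree = defaultdict(int)
    total = defaultdict(int)
    for context, label_a, label_b in double_annotated:
        total[context] += 1
        agree[context] += int(label_a == label_b)
    return {c: agree[c] / total[c] for c in total}


def make_cautious_classifier(base_classifier, agreement, threshold=0.8):
    """Wrap base_classifier so that it abstains (returns None) outside high-agreement contexts."""
    trusted = {c for c, rate in agreement.items() if rate >= threshold}

    def classify(context, features):
        if context not in trusted:
            return None  # abstain: annotators are not expected to agree here
        return base_classifier(features)

    return classify


# Toy doubly annotated sample: (context, label by annotator A, label by annotator B).
sample = [
    ("looks_at_one_person", "individual", "individual"),
    ("looks_at_one_person", "individual", "individual"),
    ("looks_at_slides", "group", "individual"),
    ("looks_at_slides", "group", "group"),
]
rates = agreement_by_context(sample)

# Stand-in base classifier (hypothetical): always answers "individual".
cautious = make_cautious_classifier(lambda features: "individual", rates)
```

Only the sample used to estimate the agreement rates has to be annotated more than once; the base classifier can be trained on singly annotated data.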


II. Explicitly modeling intersubjectivity

A second way: model different annotators separately, then find the cases where the models agree, and assume that those are the cases where the annotators would have agreed, too.

Result: a classifier that tells you for which instances other annotators would most probably agree with its classification


Advantages

Both solutions lead to ‘cautious classifiers’ that only render a judgment in those cases where annotators would have been expected to agree

This may carry over to users, too…

Neither solution needs to have all data multiply annotated for this


Time?


Pressing questions so far?

(The remainder of the talk will give two case studies.)


Case studies

I. Predicting agreement from information in other (easier) modalities: The case of contextual addressing

II. Explicitly modeling intersubjectivity in dialog markup: The case of Voting Classifiers


Data used: The AMI Corpus

100 hours of recorded meetings, annotated with dialog acts, focus of attention, gestures, addressing, decision points, and other layers


I. Contextual addressing

Addressing, and focus of attention (FOA).

Agreement is highest for certain FOA contexts.

In those contexts, the classifier also performed better.

… more in paper


II. Modeling intersubjectivity

Modeling single annotators, for ‘yeah’ utterances

Data annotated non-overlapping, 3 annotators (d, s, v)

All data, split per annotator:

Annotator   Trn    Tst
d           3585   2289
s           1753   528
v           3500   1362


II. Modeling intersubjectivity

Cross annotator training and testing

       TST_d  TST_s  TST_v  TST_all
C_d    69     64     52     63
C_s    59     68     48     57
C_v    63     57     66     63
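
A cross-annotator table of this shape could be produced along the following lines. This is a sketch under assumptions: the slide does not name the metric, so the cells are taken here to be accuracy percentages, and the function name and toy classifiers are hypothetical.

```python
def cross_annotator_scores(classifiers, test_sets):
    """Score every annotator-specific classifier on every annotator's test set.

    classifiers: dict mapping a name such as "C_d" to a callable instance -> label
    test_sets:   dict mapping a name such as "TST_d" to a list of (instance, gold_label)
    Returns a dict keyed by (classifier_name, test_set_name) with accuracy in percent.
    """
    scores = {}
    for clf_name, clf in classifiers.items():
        for test_name, items in test_sets.items():
            correct = sum(clf(x) == y for x, y in items)
            scores[(clf_name, test_name)] = 100.0 * correct / len(items)
    return scores


# Toy stand-ins: two threshold "classifiers" and two tiny test sets.
toy_classifiers = {"C_d": lambda x: x > 0, "C_s": lambda x: x > 1}
toy_tests = {
    "TST_d": [(1, True), (0, False)],
    "TST_s": [(1, False), (2, True)],
}
toy_scores = cross_annotator_scores(toy_classifiers, toy_tests)
```

In the toy data each classifier scores best on the test set labeled the way it was built, which is the pattern visible on the diagonal of the table.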


II. Modeling intersubjectivity

Building a voting classifier:

Only classify an instance when all three annotator-specific expert classifiers agree
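
The unanimity rule can be sketched as follows. The three rule-based classifiers, the feature names, and the labels are hypothetical stand-ins for trained annotator-specific models, not the classifiers used in the study.

```python
def unanimous_vote(classifiers, instance):
    """Return the shared label if every classifier agrees, otherwise None (abstain)."""
    labels = [clf(instance) for clf in classifiers]
    first = labels[0]
    return first if all(label == first for label in labels) else None


# Hypothetical expert classifiers, one per annotator (d, s, v).
def c_d(x):
    return "backchannel" if x["duration"] < 0.4 else "assess"


def c_s(x):
    return "assess" if x["speaker_continues"] else "backchannel"


def c_v(x):
    return "backchannel" if x["duration"] < 0.6 else "assess"


experts = [c_d, c_s, c_v]

short_yeah = {"duration": 0.3, "speaker_continues": False}
medium_yeah = {"duration": 0.5, "speaker_continues": False}

decided = unanimous_vote(experts, short_yeah)     # all three say "backchannel"
abstained = unanimous_vote(experts, medium_yeah)  # c_d disagrees, so no label
```

Returning None rather than a fallback label keeps the abstention explicit, so downstream code can tell "no judgment" apart from any real class.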


II. Modeling intersubjectivity

In the unanimous voting context, performance is higher due to increased precision (avg 6%)
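
When a classifier may abstain, it helps to report how often it decides alongside how often those decisions are right. The sketch below assumes abstentions are encoded as None and measures the fraction of correct labels among decided instances; the slide does not specify how its precision figure was computed, so this is an illustration only.

```python
def coverage_and_precision(predictions, gold):
    """Coverage: share of instances that received a label.
    Precision: share of the labeled instances whose label matches the gold label."""
    decided = [(p, g) for p, g in zip(predictions, gold) if p is not None]
    coverage = len(decided) / len(gold)
    precision = sum(p == g for p, g in decided) / len(decided) if decided else 0.0
    return coverage, precision


# Toy example: the classifier abstains on two of five instances.
toy_predictions = ["a", None, "b", "a", None]
toy_gold = ["a", "b", "b", "b", "a"]
toy_coverage, toy_precision = coverage_and_precision(toy_predictions, toy_gold)
```

Here three of five instances are labeled and two of those three labels are correct, so coverage is 0.6 and precision is about 0.67.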


Conclusions

Possible subjective aspects to annotation should be taken into account

Agreement metrics are not designed to handle this

We proposed two methods designed to cope with subjective data


Thank you!

Questions?
