
An Augmented Annotation Schema for Fairy Tales Using Proppian Content Descriptors

Thierry Declerck, Antonia Scheidel, Piroska Lendvai

ECAI 2010 workshop on: Language Technology for Cultural Heritage, Social Sciences, and Humanities

Proppian Content Descriptors in an Augmented Annotation Schema for Fairy Tales

Motivation

Background

The CLARIN and D-SPIN projects aim to provide an integrated and interoperable research infrastructure of language resources and language technology (LT) to support the eHumanities, among other fields.

So why start with fairy tales?

• Large, high-quality corpora (Project Gutenberg, Afanas'ev collection of Russian folktales, ...)

• Possibilities for comparison of fairy tales across cultures and languages

• Structure has been studied extensively

What makes a Fairy Tale? 1. The Cast: 7 Archetypes

Source: Vladimir Propp (1895-1970), Morphology of the Folktale

1. The Villain

2. The Princess (and Her Father)

3. The Dispatcher

4. The Hero

5. The Donor

6. The (magical) Helper

7. The False Hero

What makes a Fairy Tale? 2. The Story: 31 Functions

Phases shown in the diagram: Preparation, Complication, Donors, Struggle + Return, Dénouement

0. α Initial Situation
1. β Absentation
2. γ Interdiction
3. δ Interdiction violated
4. ε Information sought
5. ζ Information obtained
6. η Trickery
7. θ Fall for Trick
8. A Villainy / Lack
9. B Mediation
10. C Counteraction
11. ↑ Hero departs
12. D Test
13. E Pass Test
14. F Magical Helper
15. G Guidance
16. H Struggle
17. J Branding
18. I Victory
19. K Lack is liquidated
20. ↓ Hero returns
21. Pr Pursuit
22. Rs Rescue
23. O Arrival in Disguise
24. L False Claims
25. M Difficult Task
26. N Solution
27. Q Hero recognized
28. Ex Impostor exposed
29. T Transfiguration
30. U Punishment
31. W Wedding

Example 1: Little Red Riding Hood

Scheme: αγδ [εζ]³ [ηθ]³ ABC IK ExU

"The better to eat you with, my dear!"


Example 2: The Magic Swan-Geese

Scheme: αγβδ ABC↑ [D¬E¬F]³ G DEF HK↓ [PrDEF = Rs]³
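The scheme strings in the two examples use square brackets with a superscript to mark a repeated group of functions. A minimal sketch of how such a string could be expanded into a flat sequence of function symbols is given below. The function name `expand_scheme`, the token pattern, and the decision to keep `¬` attached to the following symbol and `=` as a plain token are assumptions made for illustration, not part of the authors' tooling.

```python
import re

# One function symbol: optional negation, then a two-letter code, a capital,
# a Greek letter alpha..theta, or an arrow. "=" is kept as a token of its own.
TOKEN = re.compile(r"¬?(?:Pr|Rs|Ex|[A-Zα-θ↑↓])|=")

# Either a bracketed group with an optional superscript, or a plain run.
GROUP = re.compile(r"\[([^\]]*)\]([²³]?)|([^\[\]\s]+)")

REPEATS = {"": 1, "²": 2, "³": 3}


def expand_scheme(scheme):
    """Expand a Proppian scheme string into a list of function symbols."""
    sequence = []
    for match in GROUP.finditer(scheme):
        if match.group(3) is not None:
            # Plain run of symbols outside brackets.
            sequence.extend(TOKEN.findall(match.group(3)))
        else:
            # Bracketed group, repeated according to its superscript.
            tokens = TOKEN.findall(match.group(1))
            sequence.extend(tokens * REPEATS[match.group(2)])
    return sequence


red_riding_hood = expand_scheme("αγδ [εζ]³ [ηθ]³ ABC IK ExU")
swan_geese = expand_scheme("αγβδ ABC↑ [D¬E¬F]³ G DEF HK↓ [PrDEF = Rs]³")
```

The expanded list makes the triplication explicit, which may be convenient when aligning function symbols with individual passages of a tale.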

Once upon a time a man and a woman lived with their daughter and small son.

"Dearest daughter," said the mother, "we are going to work. Look after your brother! Don't go out into the yard, be a good girl, and we'll buy you a handkerchief."

A Two-Part Problem

Our aim is to annotate fairy tales (semi-)automatically.

• How?
• Using what exactly?

The two parts: Annotation Schema and Strategy

Annotation Schemes for Fairy Tales 1: PftML (Proppian fairy tale Markup Language)

• Developed by Scott A. Malec

• Faithful to the 31 functions

• Inline XML annotation (paragraph / sentence-wise)

Drawbacks:

• Not very flexible

• Coarse-grained
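To make the idea of inline, sentence-wise annotation concrete, the sketch below wraps whole sentences in elements named after Proppian functions. The element and attribute names (`tale`, `function`, `name`) are invented for illustration and are not PftML's actual vocabulary; the point is only that the smallest annotatable unit is the sentence, which is what makes such a scheme coarse-grained.

```python
import xml.etree.ElementTree as ET

# Hypothetical element and attribute names; NOT the real PftML vocabulary.
tale = ET.Element("tale", title="The Magic Swan-Geese")

# Each sentence is wrapped as a whole in one function element.
sentences = [
    ("InitialSituation",
     "Once upon a time a man and a woman lived with their daughter and small son."),
    ("Interdiction",
     "Don't go out into the yard, be a good girl, and we'll buy you a handkerchief."),
]

for function_name, text in sentences:
    unit = ET.SubElement(tale, "function", name=function_name)
    unit.text = text

xml_string = ET.tostring(tale, encoding="unicode")
print(xml_string)
```

Because the markup surrounds the text itself, there is no obvious place to say who utters the interdiction or which later passage violates it.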

A Closer Look at a Proppian Function

Function 1: β Absentation

Subfunctions:
• β¹: Absentation of Elders
• β²: Death of Parents
• β³: Absentation of Youth

"Frame":
• Performer of absentation
• Form of absentation
• Motivation

cf. FrameNet: Fillmore and Baker, A Frames Approach to Semantic Analysis (2010)
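The "frame" for Absentation can be read as a small record with one slot per listed element. The sketch below is one possible representation under that reading; the class name `AbsentationFrame`, its field names, and the choice of β¹ for the example sentence are assumptions for illustration, not the schema defined by the authors.

```python
from dataclasses import dataclass
from typing import Optional

# Subfunctions of Absentation as listed on the slide.
SUBFUNCTIONS = {
    "β¹": "Absentation of Elders",
    "β²": "Death of Parents",
    "β³": "Absentation of Youth",
}


@dataclass
class AbsentationFrame:
    """Hypothetical record for one instance of function β (Absentation)."""
    subfunction: str                  # one of the keys in SUBFUNCTIONS
    performer: str                    # who leaves
    form: str                         # how the absentation happens
    motivation: Optional[str] = None  # may stay unfilled

    def __post_init__(self):
        if self.subfunction not in SUBFUNCTIONS:
            raise ValueError(f"unknown subfunction: {self.subfunction}")

    def subfunction_label(self):
        return SUBFUNCTIONS[self.subfunction]


# "The parents went off to work" read as an absentation of elders.
frame = AbsentationFrame(
    subfunction="β¹",
    performer="the parents",
    form="went off to work",
)
```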

Sources for PftML

[Diagram: Morphology of the Folktale (Proppian "frames", 31 functions, 7 characters); PftML]

Annotation Schemes for Fairy Tales 2: Our Approach: APftML (Augmented PftML)

• First "Propp complete" annotation scheme

• Will allow semi-automatic annotation of fairy tales

Prototype will be presented at

• CLARIN/DARIAH conference (Oct. 19-20, Vienna)

• and AMICUS workshop (Oct. 21, Vienna)

"Propp complete"?

[Diagram: Morphology of the Folktale (Proppian "frames", 31 functions, 7 characters); PftML; APftML]

Sources for APftML

• Morphology of the Folktale: Proppian "frames", 31 functions, 7 characters
• PftML
• TEI: annotation standard, with a sophisticated linking/referring infrastructure
• D-SPIN: pipeline for linguistic annotation (tokens, morphology, POS, constituency, dependency)

Annotation of The Magic Swan-Geese

1. Keep Track of Characters

"The parents went off to work, and the daughter soon enough forgot what they had told her."

• man / father
• woman / mother
• girl / daughter
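Keeping track of characters amounts to mapping different surface mentions (man, father, parents) onto stable character identifiers. A minimal sketch of such a lookup follows; the class `CharacterRegistry`, the identifiers `char_father`, `char_mother` and `char_girl`, and the treatment of "parents" as a mention covering two characters are assumptions for illustration only.

```python
from collections import defaultdict


class CharacterRegistry:
    """Maps surface mentions to character ids; a mention may cover several."""

    def __init__(self):
        self._by_mention = defaultdict(set)

    def register(self, char_id, *mentions):
        for mention in mentions:
            self._by_mention[mention.lower()].add(char_id)

    def resolve(self, mention):
        # .get avoids creating empty entries for unknown mentions.
        return sorted(self._by_mention.get(mention.lower(), set()))


registry = CharacterRegistry()
registry.register("char_father", "man", "father", "parents")
registry.register("char_mother", "woman", "mother", "parents")
registry.register("char_girl", "girl", "daughter")

print(registry.resolve("parents"))
print(registry.resolve("daughter"))
```

A real system would need coreference resolution to handle pronouns such as "they" and "her"; this lookup only covers mentions that were registered by hand.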

Annotation of The Magic Swan-Geese

2. Keep Track of Functions & "Frames"

"She put her little brother on the grass under a window and ran into the yard, where she played and got completely carried away having fun."

Violation of Interdiction
• Interdiction violated: "Don't go out into the yard"
• Person performing
• Motivation
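One way to record the link between a violation and the interdiction it violates is stand-off annotation, where function annotations point at text spans by identifier instead of wrapping the text. The sketch below assumes this reading; the names `Span`, `FunctionAnnotation`, the slot keys, and the identifier `char_girl` are hypothetical and not taken from APftML.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    span_id: str
    text: str


@dataclass
class FunctionAnnotation:
    function: str   # Proppian symbol, e.g. "δ"
    label: str
    target: str     # id of the span that realises the function
    slots: dict = field(default_factory=dict)  # slot name -> span or character id


spans = {
    "s1": Span("s1", "Don't go out into the yard"),
    "s2": Span("s2", "She put her little brother on the grass under a window "
                     "and ran into the yard, where she played and got "
                     "completely carried away having fun."),
}

interdiction = FunctionAnnotation("γ", "Interdiction", target="s1")

violation = FunctionAnnotation(
    "δ",
    "Violation of Interdiction",
    target="s2",
    slots={
        "interdiction_violated": interdiction.target,  # points back at s1
        "person_performing": "char_girl",              # hypothetical character id
        # "motivation" is left unfilled here
    },
)

violated_text = spans[violation.slots["interdiction_violated"]].text
print(violated_text)
```

Because the slots hold identifiers, the same span can take part in several annotations without nesting conflicts.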


Ongoing Work

• Integration with linguistic and semantic resources (Wiktionary, TEI annotation infrastructure for narratives, WordNet, FrameNet, ProppOnto ontology)

• Implementation of coreference resolution

• Multilingual processing, using multilingual resources

• Extension of ProppOnto with a linguistic model for ontology labels, within the MONNET project (Multilingual Ontologies for Networked Knowledge)


...and they lived happily ever after.

Thank you for your attention!

Time for your questions.

Acknowledgements

This work has been partially funded by the projects

• CLARIN & D-SPIN: Annotation of Fairy Tales, see http://www.clarin.eu/external/ and http://weblicht.sfs.uni-tuebingen.de/

• MONNET: Multilingual Ontologies, see http://cordis.europa.eu/fp7/ict/language-technologies/project-monnet_en.html

References

• Introduction: Vladimir A. Propp: Morphology of the Folktale (1968)

• PftML: Scott A. Malec's notes on the development of PftML: http://clover.slavic.pitt.edu/sam/propp/theory/propp.html (2002)

• ProppOnto: Federico Peinado, Pablo Gervás, Belén Díaz-Agudo: A Description Logic Ontology for Fairy Tale Generation (2010)

• TEI: The Text Encoding Initiative: http://www.tei-c.org/