John Laudun UNIVERSITY OF LOUISIANA [email protected] http://johnlaudun.org/ Counting Tales Towards a Computational Narratology

Counting Tales: Towards a Computational Narratology


Page 1: Counting Tales: Towards a Computational Narratology

John Laudun
University of Louisiana
[email protected]
http://johnlaudun.org/

Counting Tales: Towards a Computational Narratology

Page 2: Counting Tales: Towards a Computational Narratology

A Pirate in a Tree

Page 3: Counting Tales: Towards a Computational Narratology
Page 4: Counting Tales: Towards a Computational Narratology
Page 5: Counting Tales: Towards a Computational Narratology

Immigration Layers

[Diagram: two stacks of immigration layers.]

General historical pattern: African, Colonial French, German, Acadian, English

Rayne immigration: African, German, Acadian, English

Page 6: Counting Tales: Towards a Computational Narratology

Rayne’s Location within Louisiana

Page 7: Counting Tales: Towards a Computational Narratology

Rayne, Louisiana (雷恩)

• Population: 8500.
• 34% of population in 2000 was African American.
• African Americans tend to live in discrete neighborhoods.
• Blacks and whites interact in commercial sectors but not in residential areas.
• African American residential areas have a number of features that allow for the creation and maintenance of a distinct folk culture.

Page 8: Counting Tales: Towards a Computational Narratology

Oscar Babineaux

• Lived in Rayne his entire life.

• Well-known “shit talker.”

• Talking shit can include: insults, jokes, and a rhymed form of speaking known by folklorists as “toasts” ... and now we know it also includes legends.

• Shit talking has mostly been investigated as an urban phenomenon, but it would appear to have been active in rural areas since at least the days of Zora Neale Hurston’s Mules and Men.

Page 9: Counting Tales: Towards a Computational Narratology

Summary of Legend

• Babineaux stops by his family home.

• People are digging in the backyard for treasure.

• Joins family in prayer.

• He and his nephew take water to diggers, encounter a pirate in a tree.

• Pirate asks for something to drink.

• They give him some.

• Pirate asks again.

• They get scared and don’t give him anything.

• Pirate threatens them.

• They run. A shovel flies through the air and lands in a tree.

Page 11: Counting Tales: Towards a Computational Narratology

Like I said my family was weird. They liked to dig for money and stuff. Said my grandfather had left us some money. And they was digging for it. So one day we went, and I was at work, so I can see, we at a country spot, like our property.

So I can see a lot of people dressed in white. So I’m curious me. I said, “Well, shit, what the hell is everybody doing out there dressed in white? I wanna see.” So I goes out there. So they tell me, “You’re working right now, just go home come back. You know, come back after work.”

Page 12: Counting Tales: Towards a Computational Narratology

So I goes back, man, after work. So, they all in the house. We all praying man, everyone’s on their knees praying. They got an excavator in the back yard, digging. [Laughs.] You understand? Find this money, I guess. We’re on our knees, man, we’re praying. It’s like in the pit of the summer, like here. No wind nothing.

They had a wind come through the house. That wind was so strong my aunt was holding onto the door like that, and both her legs was in the air. That’s how strong the wind was. In the house.

So they said … they picked me, my nephew — the one I was telling you that talk all that shit, and my little niece to go bring some water to the workers in back, the one that was doing the work. So we got to walking. We passed on the side of the house to bring them.

Page 13: Counting Tales: Towards a Computational Narratology

So my nephew said, “Say man you see that guy in the tree?”

I said, “Man fuck I don’t see nobody in no tree.”

He said, “Yeah man he be right there sitting on that limb.”

I said, “I don’t see nobody man.”

I’m getting scared now.

Man I don’t see nobody.

But he’s seeing this, you know.

So he said— I said, “How he look?”

“It’s a guy,” he said, “it’s a guy dressed in a pirate suit, man.”

He said, “He got a pirate hat on. He got a pirate jacket.” And he started talking to him.

The guy in the tree started talking to him, while he’s telling me this. But the guy in the tree is telling him: shut up don’t tell me that.

Page 14: Counting Tales: Towards a Computational Narratology

So he telling me, “Man, look he right there. You can’t see him? Look he right there on that branch.”

He say, “He want something more to drink.”

You know, because what they had did: they’d put a bowl in the back yard, under this tree, with some alcohol in it. You understand?

And I don’t know if it was the sun that would dissolve it, but it would be gone.

Okay, so he say he say, “Man, he want another drink.”

So I said, “Fuck man don’t tell me that.”

I wanna get back in the house.

I said, “I don’t see nobody up there.”

So we kept on walking. We went out there. We brung them some water. So on our way back.

Page 15: Counting Tales: Towards a Computational Narratology

Look at him.

He say, “See you, you son of a bitch.”

He say, “You don’t wanna give me another drink, huh?”

He say, “You gonna be just like me.”

He say, “You see this here peg leg?”

He say, “You going to be just like me.”

He say, “For this out here y’all are going to have to lose something.”

So, man, it got kind of scared. We started walking fast. By the time we got to the house, I broke out a run. A shovel, man, come from the back of the house. I mean full force. That shovel stuck in that tree so deep we had to dig it out with an axe. It stuck … you know with a shovel, it’s hard to stick a shovel into anything. That shovel went inside the tree halfway.

Page 16: Counting Tales: Towards a Computational Narratology

Method & Madness

Page 17: Counting Tales: Towards a Computational Narratology

Figure 1: Graph of Length and Lexical Diversity of Oral Legend Texts

[Bar chart of total and unique word counts per text; vertical axis from 0 to 1100. Texts shown, in the order given: LOH 164, LOH 165, LOH 160, LAU 14, LAU 13, LOH 157, LOH 162, ANC 88, LOH 161, LOH 159, LOH 163, LOH 162b, LOH 158, ANC 91, ANC 90, ANC 89.]

Page 18: Counting Tales: Towards a Computational Narratology

TEXT       TOTAL WORDS   UNIQUE WORDS   RATIO (UNIQUE / TOTAL)
ANC 88             331            142   0.43
ANC 89             153             83   0.54
ANC 90             175             86   0.49
ANC 91             176            105   0.60
BRO-01             117             74   0.63
BRO-02              67             50   0.75
BRO-03             136             90   0.66
BRO-04             122             79   0.65
LAU 13             375            175   0.47
LAU 14             655            207   0.32
LOH 157            364            166   0.46
LOH 158            193            106   0.55
LOH 159            282            144   0.51
LOH 160            761            287   0.38
LOH 161            295            129   0.44
LOH 162            332            144   0.43
LOH 162b           194            108   0.56
LOH 163            209            114   0.55
LOH 164           1025            318   0.31
LOH 165            905            277   0.31

Details of Lexical Diversity of Oral Texts
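
The final column in the table above is consistent with unique words divided by total words, a measure often called a type-token ratio. A minimal sketch of that calculation follows, assuming simple whitespace tokenization and lowercasing; the function name `lexical_diversity` and the sample sentence are hypothetical and not taken from the original scripts.

```python
def lexical_diversity(text):
    """Return (total, unique, ratio) for a text, splitting on whitespace."""
    words = text.lower().split()
    total = len(words)
    unique = len(set(words))
    # Guard against empty input so the ratio is always defined.
    ratio = unique / total if total else 0.0
    return total, unique, ratio


# Figures reported for ANC 88 in the table: 331 total words, 142 unique.
print(round(142 / 331, 2))

# Invented sample sentence, for illustration only.
sample = "the pirate in the tree asked for a drink and the pirate asked again"
print(lexical_diversity(sample))
```

Because longer texts repeat common words more often, a ratio like this tends to fall as length grows, so it is probably safest to compare texts of similar length.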

Page 19: Counting Tales: Towards a Computational Narratology

Figure 1: Graph of Length and Lexical Diversity of Oral History Texts

[Bar chart of total and unique word counts per text; vertical axis from 0 to 200. Texts shown, in the order given: morris-02, goble-03, goble-04, davis-02, morris-01, goble-05, davis-03, goble-01, goble-02, davis-01, bridgwaters-01.]

Page 20: Counting Tales: Towards a Computational Narratology

[Bar chart comparing legends and oral histories; vertical axis from 0 to 1500. Series: total words for legend, unique words for legend, total words for oral history, unique words for oral history. Texts labeled on the axis: LOH 164, LOH 160, LAU 13, LOH 162, LOH 161, LOH 163, LOH 162b, ANC 91, goble-03, goble-04, BRO-04, davis-02, goble-05, goble-01, davis-01, bridgwaters-01.]

Page 21: Counting Tales: Towards a Computational Narratology

Zipf’s law states that given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table.

The most frequent word will occur approximately twice as often as the second most frequent word and three times as often as the third.
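
In its idealized form, the law says that the word at rank r occurs about f(1)/r times, where f(1) is the count of the most frequent word. The rough sketch below compares that prediction with the counts these slides report for the five most frequent legend words. The helper name `zipf_expected` is hypothetical, and this is an illustration rather than a fitted model.

```python
def zipf_expected(top_count, rank):
    """Frequency an idealized Zipf distribution predicts for a given rank."""
    return top_count / rank


# Counts reported in the slides for the five most frequent legend words.
observed = [("the", 251), ("and", 202), ("was", 201), ("he", 191), ("it", 175)]

top_count = observed[0][1]
for rank, (word, count) in enumerate(observed, start=1):
    predicted = zipf_expected(top_count, rank)
    print(f"{rank} {word:<4} observed={count:>3} predicted={predicted:6.1f}")
```

Small corpora can depart noticeably from the ideal curve, so a comparison like this is a starting point for inspection rather than a test of the law.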

[Figure: power law curve]

Page 22: Counting Tales: Towards a Computational Narratology

Most Frequent Words in Legends

the 251
and 202
was 201
he 191
it 175
to 154
a 143
they 135
that 121
i 121
there 105
of 102
in 100
said 77
you 67
had 66
this 63
so 63
on 48
but 47
me 46
went 45
out 44
him 42
well 40
we 40
up 40
money 40
back 39
one 38
man 38
got 36
his 34
all 34
with 33
where 33
just 33
know 32
see 29
when 28
them 28
like 28
go 27
told 26
she 26
little 26
what 25
old 25
now 25
my 24
here 23
some 22
about 22
time 21
over 21
come 21
at 21
something 20
house 19

Page 23: Counting Tales: Towards a Computational Narratology

                  LEXICAL DIVERSITY                    LENGTH
COLLECTION        AVERAGE   MINIMUM   MAXIMUM          AVERAGE   MINIMUM    MAXIMUM
Legends           0.46      0.31      0.60 (0.75)      343       (67) 153   1025
Oral Histories    0.65      0.55      0.79             111       43         196

Comparison of Lengths and Lexical Diversity

Page 24: Counting Tales: Towards a Computational Narratology

Why count words?

• Because we haven’t, and we should have.
• In oral discourse, especially in traditional oral discourse, every word matters, and we should have a better understanding of what each word does.
• Word counts begin to give us a baseline for understanding a greater variety of human verbal activity. (That is, let’s join up with the linguists; or rather, since they are already getting there with stylistics, wouldn’t it be nice not to get pushed out?)

Page 25: Counting Tales: Towards a Computational Narratology

Potential Problem

• Word counts depend on accurate transcripts. [Computer scientists have a saying: “Garbage in, garbage out.”]
• The good news is that we have four decades of talking about this and doing it.
• What this means is that we can take our accurate transcriptions and begin to really understand how many and which words humans need to create their reality, or alternate realities.

Page 26: Counting Tales: Towards a Computational Narratology

Why Count Words? (Redux)

• We can assume that all genres will have a wide variety of forms, but, looked at from a statistical perspective, are there some patterns there that might lead us to more interesting investigations?
• Will folktales emerge as typically longer, and will that length be a function of the necessity of setting up a more complex storyworld?
• E.g., Ray Hicks’ telling of “Jack and the Fire Dragon” is 1,977 words. (See handout.)
• What about lexical diversity and frequencies?
• And how do these simple matters of counting relate to morphologies and semantic networks?

Page 27: Counting Tales: Towards a Computational Narratology

After We Count Words...

• I said earlier that “in oral discourse, every word matters.” That’s not entirely true.
• As we saw in the previous sets of statistics, there are a lot of words that don’t matter very much when we think about the meaning of a story:

the 251

and 202

was 201

he 191

it 175

to 154

a 143

they 135

that 121

i 121

there 105

of 102

in 100

said 77

you 67

had 66

this 63

so 63

on 48

but 47

me 46

went 45

out 44

him 42

well 40

we 40

up 40

money 40

back 39

one 38

man 38

got 36

his 34

all 34

with 33

where 33

just 33

know 32

see 29

Page 28: Counting Tales: Towards a Computational Narratology

• Linguists call those words that contribute to the structure of an utterance but not its meaning function words.

• They often eliminate these words from consideration by using stop lists when running a computer program to assess various dimensions of a text, like calculating the lexical diversity of texts.

• A word like said, for example, is typically dropped from consideration by linguists. In my own work, said is incredibly important both for its syndetic and traditionalizing functions (Laudun 2012).

• What remains are content words.
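
A minimal sketch of the stop-list step described above, assuming a small hand-made stop list. The set `STOP_WORDS`, the function `content_word_counts`, and the sample sentence are hypothetical; real stop lists, such as the ones distributed with NLTK, are much longer. The word said is deliberately left off the list here, following the point made about its importance.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list; "said" is kept as a content word.
STOP_WORDS = {"the", "and", "was", "he", "it", "to", "a", "they", "that", "i",
              "there", "of", "in", "you", "had", "this", "so", "on", "but"}


def content_word_counts(text, stop_words=STOP_WORDS):
    """Count the words in a text, skipping anything on the stop list."""
    words = re.findall(r"[a-z'-]+", text.lower())
    return Counter(w for w in words if w not in stop_words)


# Invented sample sentence, for illustration only.
sample = "They said the money was buried there, and he said it was in the yard."
print(content_word_counts(sample).most_common(3))
```

Passing a different `stop_words` set makes it easy to compare results with and without a given word, which is one way to check how much a choice like keeping said changes the counts.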

Page 29: Counting Tales: Towards a Computational Narratology

Content Words in Context

g to see . BA : There was n’t money buried there ? SF : Supposedly. That ’s wh
yed around his grave a lot . He was buried , still buried , where we lived . H
grave a lot . He was buried , still buried , where we lived . He was buried in
ll buried , where we lived . He was buried in the yard where I lived . They ha
m . Killed him . His wife and Billy buried him right there . That night as it
ppi . That night , supposedly , she buried her money on the other side of Jean
nd early American coins . They were buried there . They said it was Lafitte .
West was the outlaw that did that ( buried the money ) , according to the Mexi
s told the story that this money is buried in there. By the time they made the
n’t remember where it was. They had buried all this money he had. What it was
tly , this money was supposed to be buried there. That money , as far as anybo
re ’s always been a claim that they buried whatever they had somewhere in that
ime ago ( pause ) they claimed they buried their money. That was back when the
 was seven-foot deep. You could ’ve buried a car in it. There was some of them
hat , well usually treasure , money buried around. These three fellows came up
 ese treasure things, like you hunt buried treasure with. He had one of those

Page 30: Counting Tales: Towards a Computational Narratology

Key Words in Context allows us to see, at a glance, exactly where words occur in a sentence, but what if we want to see larger patterns? For example, what if we wanted to see which words are regularly found together, that is, collocated?
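
A keyword-in-context listing can be produced with NLTK's concordance tools. The sketch below uses only the Python standard library instead, and the function names `kwic` and `bigram_counts` are hypothetical. It also counts adjacent word pairs as a very crude stand-in for collocation, which dedicated tools usually score statistically rather than by raw counts. The sample sentence is invented for illustration.

```python
from collections import Counter


def kwic(tokens, keyword, width=3):
    """Yield (left, keyword, right) context windows for each hit."""
    for i, token in enumerate(tokens):
        if token.lower() == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            yield left, token, right


def bigram_counts(tokens):
    """Count adjacent word pairs as a crude measure of collocation."""
    lowered = [t.lower() for t in tokens]
    return Counter(zip(lowered, lowered[1:]))


# Invented sample sentence, for illustration only.
tokens = "He was buried in the yard where they buried the money".split()
for left, word, right in kwic(tokens, "buried"):
    # Right-justify the left context so the keyword lines up in a column.
    print(f"{left:>20} {word} {right}")
print(bigram_counts(tokens).most_common(2))
```

Widening the `width` parameter trades readability for context; a window of a few words on each side is often enough to see recurring phrasing.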

Page 31: Counting Tales: Towards a Computational Narratology
Page 32: Counting Tales: Towards a Computational Narratology

Key words can also be said to co-occur across several texts, giving us a better sense of how texts are related to each other. Currently, this kind of analysis, often used in topic modeling, treats texts as bags of words.
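
Treating texts as bags of words means discarding word order and keeping only counts. The small sketch below, which assumes two invented sample texts and the hypothetical helpers `bag_of_words` and `shared_words`, shows how words that co-occur across texts can be found from those counts.

```python
from collections import Counter


def bag_of_words(text):
    """Reduce a text to word counts, discarding word order."""
    return Counter(text.lower().split())


def shared_words(bags):
    """Return the set of words that appear in every bag."""
    vocabularies = [set(bag) for bag in bags]
    return set.intersection(*vocabularies) if vocabularies else set()


# Invented sample texts, for illustration only.
texts = {
    "text_a": "they buried the money under the tree",
    "text_b": "the pirate buried treasure near the tree",
}
bags = {name: bag_of_words(text) for name, text in texts.items()}
print(sorted(shared_words(bags.values())))
```

Topic-modeling tools start from counts like these and then infer groups of related words statistically; this sketch stops at the counting step.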

Page 33: Counting Tales: Towards a Computational Narratology
Page 34: Counting Tales: Towards a Computational Narratology

Software

Topic modeling
• MALLET for LDA (Latent Dirichlet Allocation)
• LSA (Latent Semantic Analysis) is built into Mac OS X, available elsewhere.

Word Statistics / KWIC / Word Placement
• Python, Python NLTK
• For everything else, R.

All of this information will be in the bibliography.

Page 35: Counting Tales: Towards a Computational Narratology

What Python looks like

#!/usr/bin/env python3
# Count total and unique words in every .txt file in the current directory.

import glob
import re

files = {}
for fpath in glob.glob("*.txt"):
    with open(fpath) as f:
        # Replace everything except letters, apostrophes, and hyphens with spaces.
        fixed_text = re.sub(r"[^a-zA-Z'-]", " ", f.read())
    words = fixed_text.split()
    files[fpath] = (len(words), len(set(words)))
    print("Total Words:", len(words))
    print("Total Unique:", len(set(words)))

# Write one "filename,total,unique" row per text.
with open("wordstats.csv", "w") as f:
    for fname, (total, unique) in files.items():
        print("%s,%s,%s" % (fname, total, unique), file=f)

Page 36: Counting Tales: Towards a Computational Narratology

That doesn’t look like folklore studies...

Page 37: Counting Tales: Towards a Computational Narratology

Neither did Propp or Lévi-Strauss...

Page 38: Counting Tales: Towards a Computational Narratology
Page 39: Counting Tales: Towards a Computational Narratology
Page 40: Counting Tales: Towards a Computational Narratology
Page 41: Counting Tales: Towards a Computational Narratology

So what?

Page 42: Counting Tales: Towards a Computational Narratology

None of this looks like folklore studies!

• Not as we currently practice it.
• We have gotten very good at the ethnographic description of texts in context. We understand motivated behavior very well.
• We left behind the study of the human mind (e.g., structuralism).
• Cognitive studies, and the adjacent field of creativity studies, have exploded in recent years, and we need to make sure we are part of that conversation.

Page 43: Counting Tales: Towards a Computational Narratology

A Future for Folklore Studies?

• Performance theory / ethnomethodologies / ethnography of speaking give us some of the most accurate accounts of human verbal behavior in the world.
• Lord’s work on oral formulas is now foundational work in cognitive science. See: David Rubin’s Memory in Oral Traditions. (Sample chapter is in packet of papers.)
• Also in packet: computer scientists working on morphology, physicists mapping myth.

Sorry for letter-sized pages: PDFs will be in Dropbox folder.

Page 44: Counting Tales: Towards a Computational Narratology
Page 45: Counting Tales: Towards a Computational Narratology

Laudun, John. 2012. “Talking Shit” in Rayne: How Aesthetic Features Reveal Ethical Structures. Journal of American Folklore 125 (497): 304–326.