40
Leveraging discourse information effectively for authorship attribution Elisa Ferracane, Su Wang, Raymond J. Mooney University of Texas at Austin

Leveraging discourse information effectively for

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Leveraging discourse information effectively for

Leveraging discourse information effectively for

authorship attributionElisa Ferracane, Su Wang, Raymond J. Mooney

University of Texas at Austin

Page 2: Leveraging discourse information effectively for

2

Task

• Authorship Attribution: identify the author of a text, given a set of author-labeled training texts.

Page 3: Leveraging discourse information effectively for

3

Authorship Attribution• Neural networks (e.g., character-level CNNs) have proven

very powerful…

• capture stylometric cues at the surface level

“My very photogenic mother died in a freak accident (picnic, lightning) when I was three...”

“But what principally attracted attention of Nicholas, was the old gentleman’s eye… Grafted upon the quaintness and oddity of his appearance, was something…”

Lolita, Nabokov

Nichola Nickleby, Dickens

Page 4: Leveraging discourse information effectively for

4

Authorship Attribution

• Authors also have particular rhetorical styles…

• But how do you incorporate discourse into a neural net?

Page 5: Leveraging discourse information effectively for

5

Our Contributions

1) How can you featurize discourse information?

2) How can you integrate discourse information into the network?

3) Can discourse help in SOTA model (bigram character CNN)?

Page 6: Leveraging discourse information effectively for

Q1: How can you featurize discourse information?

6

• Use an entity grid model (Barzilay & Lapata, 2008) with either:

• grammatical relations, or

• RST discourse relations

Page 7: Leveraging discourse information effectively for

Q1: How can you featurize discourse information?

7

(1) My father was a clergyman of the north of England, who was deservedly respected by all who knew him; and, in his younger days, lived pretty comfortably on the joint income of a small incumbency and a snug little property of his own.

(2) My mother, who married him against the wishes of her friends, was a squire’s daughter, and a woman of spirit.

(3) In vain it was represented to her, that if she became the poor parson’s wife, she must relinquish her carriage and her lady’s-maid, and all the luxuries and elegancies of affluence; which to her were little less than the necessaries of life.

Page 8: Leveraging discourse information effectively for

Q1: How can you featurize discourse information?

8

(1) My father was a clergyman of the north of England, who was deservedly respected by all who knew him; and, in his younger days, lived pretty comfortably on the joint income of a small incumbency and a snug little property of his own.

(2) My mother, who married him against the wishes of her friends, was a squire’s daughter, and a woman of spirit.

(3) In vain it was represented to her, that if she became the poor parson’s wife, she must relinquish her carriage and her lady’s-maid, and all the luxuries and elegancies of affluence; which to her were little less than the necessaries of life.

Page 9: Leveraging discourse information effectively for

Q1: How can you featurize discourse information?

9

(1) My father was a clergyman of the north of England, who was deservedly respected by all who knew him; and, in his younger days, lived pretty comfortably on the joint income of a small incumbency and a snug little property of his own.

(2) My mother, who married him against the wishes of her friends, was a squire’s daughter, and a woman of spirit.

(3) In vain it was represented to her, that if she became the poor parson’s wife, she must relinquish her carriage and her lady’s-maid, and all the luxuries and elegancies of affluence; which to her were little less than the necessaries of life.

Page 10: Leveraging discourse information effectively for

10

Q1: How can you featurize discourse information?

(1)

(2)

(3)

father mother

Barzilay and Lapata (2008)

row: sentencecolumn: salient entity

Page 11: Leveraging discourse information effectively for

Q1: How can you featurize discourse information?

11

(1) [My father]SUBJECT was a clergyman of the north of England, who was deservedly respected by all who knew him; and, in his younger days, lived pretty comfortably on the joint income of a small incumbency and a snug little property of his own.

(2) [My mother]SUBJECT, who married [him]OBJECT against the wishes of her friends, was a squire’s daughter, and a woman of spirit.

(3) In vain it was represented to her, that if [she]SUBJECT became the [poor parson]OTHER’s wife, she must relinquish her carriage and her lady’s-maid, and all the luxuries and elegancies of affluence; which to her were little less than the necessaries of life.

Page 12: Leveraging discourse information effectively for

Q1: How can you featurize discourse information?

12

(1) [My father]SUBJECT was a clergyman of the north of England, who was deservedly respected by all who knew him; and, in his younger days, lived pretty comfortably on the joint income of a small incumbency and a snug little property of his own.

(2) [My mother]SUBJECT, who married [him]OBJECT against the wishes of her friends, was a squire’s daughter, and a woman of spirit.

(3) In vain it was represented to her, that if [she]SUBJECT became the [poor parson]OTHER’s wife, she must relinquish her carriage and her lady’s-maid, and all the luxuries and elegancies of affluence; which to her were little less than the necessaries of life.

Page 13: Leveraging discourse information effectively for

13

Q1: How can you featurize discourse information?

(1) S -

(2) O S

(3) X S

father mother

Grammatical relations

Barzilay and Lapata (2008)

Page 14: Leveraging discourse information effectively for

Q1: How can you featurize discourse information?

14

• Discourse relations:• Rhetorical Structure Theory (RST)

• Divide a document into elementary discourse units (EDUs), usually clauses

• Organize EDUs into a tree structure:• edges are discourse relation types• node in a relation can be either the nucleus (more

“important”) or satellite

Page 15: Leveraging discourse information effectively for

Q1: How can you featurize discourse information?

15

if she became the poor parson’s wife, she must relinquish her carriage and her lady’s-maid, and all the luxuries and elegancies of affluence; which to her were little less than the necessaries of life.

Page 16: Leveraging discourse information effectively for

Q1: How can you featurize discourse information?

16

if she became the poor parson’s wife, she must relinquish her carriage and her lady’s-maid, and all the luxuries and elegancies of affluence; which to her were little less than the necessaries of life.

Page 17: Leveraging discourse information effectively for

Q1: How can you featurize discourse information?

17

if she became the poor parson’s wife,

she must relinquish her carriage and her lady’s-maid, and all the luxuries and elegancies of affluence;

which to her were little less than the necessaries of life.

Page 18: Leveraging discourse information effectively for

Q1: How can you featurize discourse information?

18

if she became the poor parson’s wife,

she must relinquish her carriage and her lady’s-maid, and all the luxuries and elegancies of affluence;

which to her were little less than the necessaries of life.

condition-ncondition-s

Page 19: Leveraging discourse information effectively for

Q1: How can you featurize discourse information?

19

if she became the poor parson’s wife,

which to her were little less than the necessaries of life.

condition-ncondition-s

she must relinquish her carriage and her lady’s-maid, and all the luxuries and elegancies of affluence;

interpretation-sinterpretation-n

Page 20: Leveraging discourse information effectively for

Q1: How can you featurize discourse information?

20

Page 21: Leveraging discourse information effectively for

21

Q1: How can you featurize discourse information?

(1)

background.N, TopicShift,

elaboration.S, background.S

-

(2) elaboration.Selaboration.N,

circumstance.N, TopicShift

(3) condition.Nattribution.S, condition.N,

interpretation.S

father mother

RST discourse relations

Feng and Hirst (2014)

Page 22: Leveraging discourse information effectively for

22

Q2: How can you integrate discourse information into the network?

• Use probability vector

• Use embeddings!

Page 23: Leveraging discourse information effectively for

Q2: How can you integrate discourse information into the network?

CNN without discourse

Ruder et al., 2016; Shrestha et al., 2017, Sari et al., 2017

Page 24: Leveraging discourse information effectively for

Q2: How can you integrate discourse information into the network?

CNN with discourse probability vector

Page 25: Leveraging discourse information effectively for

Q2: How can you integrate discourse information into the network?

CNN with discourse embeddings

Page 26: Leveraging discourse information effectively for

Q2: How can you integrate discourse information into the network?

• Use embeddings

• Local vs. Global

• Local: how are entities changing across contiguous sentences?

• Global: how is each entity changing across a document?

Page 27: Leveraging discourse information effectively for

27

father mother

Sequence:

Q2: How can you integrate discourse information into the network?

Local: by contiguous sentences

(1) S -

(2) O S

(3) X S

1 2

3 4

so, -s, ox, ss

Page 28: Leveraging discourse information effectively for

28

father mother

Sequence:

Q2: How can you integrate discourse information into the network?

Global: by entity

(1) S -

(2) O S

(3) X S

1 3

2 4

so,ox, -s, ss

Page 29: Leveraging discourse information effectively for

29

Datasets

Dataset # authors mean words/auth

mean words/text

IMDB62 62 349,004 349

Novel-50 50 709,880 2,000

Page 30: Leveraging discourse information effectively for

Results

30

1) How to featurize?grammatical relations

vs.RST discourse relations

F1

90

92.5

95

97.5

100

IMDB Novel-50

grammatical relationsRST discourse relations

Page 31: Leveraging discourse information effectively for

Results

31

1) How to featurize?grammatical relations

vs.RST discourse relations

F1

90

92.5

95

97.5

100

IMDB Novel-50

grammatical relationsRST discourse relations

Page 32: Leveraging discourse information effectively for

Results

32

2) How to integrate?probability vector

vs. discourse embedding

F1

90

92.5

95

97.5

100

IMDB Novel-50

probability vectordiscourse embedding

Page 33: Leveraging discourse information effectively for

Results

33

2) How to integrate?probability vector

vs. discourse embedding

F1

90

92.5

95

97.5

100

IMDB Novel-50

probability vectordiscourse embedding

Page 34: Leveraging discourse information effectively for

Results

34

2) How to integrate?localvs.

global

F1

91

93.25

95.5

97.75

100

IMDB Novel-50

local global

Page 35: Leveraging discourse information effectively for

Results

35

2) How to integrate?localvs.

global

F1

91

93.25

95.5

97.75

100

IMDB Novel-50

local global

Page 36: Leveraging discourse information effectively for

Results

36

3) Does discourse help?

It depends…

F1

90

92.5

95

97.5

100

IMDB

No discourse baselineprobability vectordiscourse embedding (gr. rels.)discourse embedding (RST)

Page 37: Leveraging discourse information effectively for

Results

37

3) Does discourse help?

Yes!

F1

95

96.25

97.5

98.75

100

Novel-50

No discourse baselineprobability vectordiscourse embedding (gr. rels.)discourse embedding (RST)

Page 38: Leveraging discourse information effectively for

Error Analysis• The least-represented author (Ambrose Bierce) obtains the

biggest improvement from discourse:

—Discourse feature is more robust with smaller, fewer samples compared to character bigrams

• Two authors who gained large improvements from discourse wrote a variety of genres (e.g., both supernatural horror and love stories)

—Character bigrams can’t generalize well to the different vocabularies, but discourse captures the similar rhetorical style

38

Page 39: Leveraging discourse information effectively for

Conclusion• Discourse improves authorship attribution over a

strong baseline of character-level CNN

• Embeddings of RST discourse relations at the global level perform the best

• Works better on longer documents

39

Page 40: Leveraging discourse information effectively for

Thank you!

[email protected]

Leveraging discourse information effectively for authorship

attribution