A language-independent method for the extraction of RDF verbalization templates

Basil Ell, Andreas Harth
Institute of Applied Informatics and Formal Description Methods (AIFB), Karlsruhe, Germany
KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association
www.kit.edu

8th International Natural Language Generation Conference, 20 June 2014, Philadelphia, PA, USA


DESCRIPTION

This presentation was given at the 8th International Natural Language Generation Conference (INLG 2014), Philadelphia, Pennsylvania, and is related to the publication of the same title. With the rise of the Semantic Web, more and more data become available encoded using the Semantic Web standard RDF. This representation is geared toward machines: designed to be easily processable by machines, it is difficult for non-experts to understand. Transforming RDF data into human-comprehensible text would help non-experts assess this information. In this paper we present a language-independent method for extracting RDF verbalization templates from a parallel corpus of text and data. Our method is based on distant-supervised simultaneous multi-relation learning and frequent maximal subgraph pattern mining. We demonstrate the feasibility of this method on a parallel corpus of Wikipedia articles and DBpedia data for English and German. A preprint of the publication is available at http://km.aifb.kit.edu/sites/bridge-patterns/Ell_Harth_INLG2014_preprint.pdf



Motivation

More and more data openly available as RDF (Linked Open Data initiative)

NLG over this data:
Search engine (keywords, questions, etc.): text
Encyclopedia or Google Knowledge Graph: textual description of a thing


Example RDF data - Triples

Subject               Predicate         Object
dbr:Curtain_(Novel)   dbo:author        dbr:Agatha_Christie
dbr:Curtain_(Novel)   rdf:type          dbo:Book
dbr:Curtain_(Novel)   rdfs:label        "Curtain (novel)"@en
dbr:Curtain_(Novel)   dbp:releaseDate   "September 1975"@en
dbr:Agatha_Christie   rdf:type          dbo:Writer
dbr:Agatha_Christie   rdfs:label        "Agatha Christie"@en
dbo:Book              rdfs:label        "book"@en


Example RDF data - Graph


Overview

Motivation

RDF Verbalization Templates

Automatic Template Extraction

Evaluation

Related Work

Summary


RDF VERBALIZATION TEMPLATES


RDF Verbalization Template (1/2)

Graph pattern (GP)

Sentence pattern (SP)


RDF Verbalization Template (2/2)

RDF data

Subject               Predicate         Object
dbr:Curtain_(Novel)   dbo:author        dbr:Agatha_Christie
dbr:Curtain_(Novel)   rdf:type          dbo:Book
dbr:Curtain_(Novel)   rdfs:label        "Curtain (novel)"@en
dbr:Curtain_(Novel)   dbp:releaseDate   "September 1975"@en
dbr:Agatha_Christie   rdf:type          dbo:Writer
dbr:Agatha_Christie   rdfs:label        "Agatha Christie"@en
dbo:Book              rdfs:label        "book"@en

GP represented as SPARQL query

SELECT ?book_label ?book_type_label ?author_label ?book_rD
WHERE {
  ?book dbo:author ?author .
  ?book dbp:releaseDate ?book_rD .
  ?book rdf:type ?book_type .
  ?book_type rdfs:label ?book_type_label .
  ?book rdfs:label ?book_label .
  ?author rdfs:label ?author_label .
  ?author rdf:type dbo:Writer .
}

Query results

book_label = "Curtain (novel)"
book_type_label = "book"
author_label = "Agatha Christie"
book_rD = "September 1975"

Verbalization result

Curtain is a book by Agatha Christie published in September 1975.
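How a template turns data into a sentence can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the matcher is a naive backtracking search over in-memory tuples instead of a SPARQL engine, the sentence pattern string and the function names (`match`, `strip_parenthetical`, `verbalize`) are assumptions made for this sketch, and stripping "(novel)" from the label is a guess at the kind of modifier needed to get from "Curtain (novel)" to "Curtain".

```python
import re

# Example triples from the slide, with prefixed names kept as plain strings.
TRIPLES = [
    ("dbr:Curtain_(Novel)", "dbo:author", "dbr:Agatha_Christie"),
    ("dbr:Curtain_(Novel)", "rdf:type", "dbo:Book"),
    ("dbr:Curtain_(Novel)", "rdfs:label", "Curtain (novel)"),
    ("dbr:Curtain_(Novel)", "dbp:releaseDate", "September 1975"),
    ("dbr:Agatha_Christie", "rdf:type", "dbo:Writer"),
    ("dbr:Agatha_Christie", "rdfs:label", "Agatha Christie"),
    ("dbo:Book", "rdfs:label", "book"),
]

# Graph pattern (GP): the triple patterns of the SPARQL query on the slide.
GRAPH_PATTERN = [
    ("?book", "dbo:author", "?author"),
    ("?book", "dbp:releaseDate", "?book_rD"),
    ("?book", "rdf:type", "?book_type"),
    ("?book_type", "rdfs:label", "?book_type_label"),
    ("?book", "rdfs:label", "?book_label"),
    ("?author", "rdfs:label", "?author_label"),
    ("?author", "rdf:type", "dbo:Writer"),
]

# Sentence pattern (SP); the exact wording is an assumption for this sketch.
SENTENCE_PATTERN = (
    "{book_label} is a {book_type_label} by {author_label} "
    "published in {book_rD}."
)


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        new = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, new)


def strip_parenthetical(label):
    """Hypothetical modifier: drop a trailing '(...)' from a label."""
    return re.sub(r"\s*\([^)]*\)$", "", label)


def verbalize(triples):
    """Fill the sentence pattern once per match of the graph pattern."""
    sentences = []
    for binding in match(GRAPH_PATTERN, triples):
        values = {name.lstrip("?"): value for name, value in binding.items()}
        values["book_label"] = strip_parenthetical(values["book_label"])
        sentences.append(SENTENCE_PATTERN.format(**values))
    return sentences
```

Calling `verbalize(TRIPLES)` on the example data yields the single sentence shown as the verbalization result above.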


AUTOMATIC TEMPLATE EXTRACTION


Template Extraction (1/6) - Overview

Parallel text-data corpus → RDF verbalization templates

1. Sentence Collection
2. Text-Data Alignment
3. Abstraction
4. Grouping
5. Pattern Mining
6. Template Creation

Experiment:
Text from Wikipedia
Data from DBpedia
10 virtual machines (8 vCPUs, 8 GB RAM, 40 GB disk)
Extraction ran for 2 weeks


Template Extraction (2/6) - Features

Distant-supervised: no hand-labeled training data required

Simultaneous multi-relation learning: simultaneously learning all relations in a sentence

Frequent maximal subgraph pattern mining: identify commonalities among RDF graph patterns

Language independent: does not rely on syntactic parsing


Example Template (1/2)


Example Template (2/2)



Template Extraction (3/6) - Alignment

[Figure: a sentence aligned with RDF data via label edges. Legend: entity, literal, identified entity (i), identified literal (i), modifier (m1 to m5), matched string.]

Language-independent approach: no syntactic parsing
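Alignment without syntactic parsing can be illustrated with plain string matching. The sketch below is an assumption-laden toy, not the method from the paper: `align` and `abstract` are hypothetical names, the labels are assumed to have already passed through any modifiers (for example "Curtain (novel)" reduced to "Curtain"), only the first occurrence of each label is considered, and longer labels are matched first so that overlapping shorter matches are skipped.

```python
SENTENCE = "Curtain is a book by Agatha Christie published in September 1975."

# Resource or literal -> label string to look for (already modified labels).
LABELS = {
    "dbr:Curtain_(Novel)": "Curtain",
    "dbr:Agatha_Christie": "Agatha Christie",
    "dbo:Book": "book",
    '"September 1975"': "September 1975",
}


def align(sentence, labels):
    """Return sorted (start, end, resource) spans of non-overlapping matches."""
    matches = []
    # Longest labels first, so a short label cannot claim part of a long one.
    for resource, label in sorted(labels.items(), key=lambda kv: -len(kv[1])):
        start = sentence.find(label)
        if start == -1:
            continue
        end = start + len(label)
        if all(end <= s or start >= e for s, e, _ in matches):
            matches.append((start, end, resource))
    return sorted(matches)


def abstract(sentence, matches):
    """Replace each matched string with a numbered variable {V1}, {V2}, ..."""
    parts, position, variables = [], 0, {}
    for number, (start, end, resource) in enumerate(matches, start=1):
        parts.append(sentence[position:start])
        parts.append("{V%d}" % number)
        variables["V%d" % number] = resource
        position = end
    parts.append(sentence[position:])
    return "".join(parts), variables


pattern, variables = abstract(SENTENCE, align(SENTENCE, LABELS))
```

Because the only language-specific input is the set of labels, the same code applies unchanged to German sentences with German labels, which is the point of avoiding a parser.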


Template Extraction (4/6) - Abstraction

[Figure: two abstractions of an aligned sentence. Abstraction 1: hypothesis graph pattern 1, pattern 1. Abstraction 2: hypothesis graph pattern 2, pattern 2.]


Template Extraction (5/6) - Grouping

Group graph patterns with equivalent sentence patterns:

'"{V1}" is a short story by {V2}.': abstraction-64451-1, abstraction-88393-1, abstraction-4732-1, abstraction-50480-1

'"{V1}" is a single by American {V9} {V4} {V8}.': abstraction-22205-1, abstraction-22205-3, abstraction-72533-1, abstraction-127891-2

'{V1} (born {V2}) is a German footballer.': abstraction-86372-1, abstraction-86415-1, abstraction-135340-5, abstraction-140464-2

Hypothesis graph patterns with equivalent sentence pattern
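The grouping step amounts to bucketing abstractions by their sentence pattern. A minimal sketch, assuming that "equivalent" can be approximated by exact string equality of the sentence patterns (the slides do not spell out the equivalence test) and using a hypothetical function name:

```python
from collections import defaultdict


def group_by_sentence_pattern(abstractions):
    """Map each sentence pattern to the ids of the abstractions that share it."""
    groups = defaultdict(list)
    for abstraction_id, sentence_pattern in abstractions:
        groups[sentence_pattern].append(abstraction_id)
    return dict(groups)


# (abstraction id, sentence pattern) pairs, taken from the examples above.
ABSTRACTIONS = [
    ("abstraction-64451-1", '"{V1}" is a short story by {V2}.'),
    ("abstraction-86372-1", "{V1} (born {V2}) is a German footballer."),
    ("abstraction-88393-1", '"{V1}" is a short story by {V2}.'),
    ("abstraction-86415-1", "{V1} (born {V2}) is a German footballer."),
]

GROUPS = group_by_sentence_pattern(ABSTRACTIONS)
```

Each group then holds the hypothesis graph patterns that compete to explain one sentence pattern, which is the input the pattern mining step needs.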


Template Extraction (6/6) - fmSpan

fmSpan - Frequent maximal subgraph pattern mining

Input:
Set of graph patterns
Minimal coverage value: c

Output: Set of graph patterns

Each graph pattern
is a subgraph of at least c graph patterns (→ frequent)
cannot be extended while maintaining coverage (→ maximal)
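The two output conditions can be made concrete with a heavily simplified sketch. It is not fmSpan itself: each graph pattern is modelled as a frozenset of triple patterns whose variable names are assumed to be already aligned across patterns, so "subgraph" reduces to "subset" and no subgraph isomorphism or connectivity check is done. "Maintaining coverage" is read here as "still contained in at least c input patterns". The function name is hypothetical, and enumerating all c-sized combinations is only feasible for small inputs.

```python
from itertools import combinations


def frequent_maximal_patterns(graph_patterns, c):
    """Return the maximal triple-pattern sets contained in >= c input patterns."""
    candidates = set()
    # Every frequent set is contained in the intersection of some c inputs,
    # and each such intersection is itself frequent, so the maximal frequent
    # sets are exactly the maximal non-empty intersections.
    for combo in combinations(graph_patterns, c):
        common = frozenset.intersection(*combo)
        if common:
            candidates.add(common)
    # Keep only candidates that are not a proper subset of another candidate.
    return {p for p in candidates if not any(p < q for q in candidates)}


AUTHOR = ("?book", "dbo:author", "?author")
IS_BOOK = ("?book", "rdf:type", "dbo:Book")
IS_WRITER = ("?author", "rdf:type", "dbo:Writer")
RELEASED = ("?book", "dbp:releaseDate", "?date")

PATTERNS = [
    frozenset({AUTHOR, IS_BOOK, IS_WRITER}),
    frozenset({AUTHOR, IS_BOOK, RELEASED}),
    frozenset({AUTHOR, IS_WRITER}),
]

RESULT = frequent_maximal_patterns(PATTERNS, c=2)
```

With c = 2 the example returns two patterns, {AUTHOR, IS_BOOK} and {AUTHOR, IS_WRITER}; the pattern {AUTHOR} alone is frequent but not maximal, since it can be extended either way without dropping below the coverage value.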


EVALUATION


Evaluation (1/4) - Experiment

Parallel text-data corpus:
88,708,622 triples
4,004,478 English documents, 716,049 German documents
3,811,992 English sentences, 794,040 German sentences
3,434,108 abstracted English sentences, 530,766 abstracted German sentences (with at least two identified entities)

      #groups≥5   #templates   #all groups
en    4569        3816         686,687
de    2130        1250         269,551


Evaluation (2/4) - Coverage

How often can a template be applied?

[Figure: histogram for en and de. X-axis: logarithmic bins from 1–10 up to 10,000,000–100,000,000. Y-axis: 0 to 350.]

About 300 templates can each be used to verbalize between 10,000 and 100,000 subgraphs.


Evaluation (3/4)

User study: 10 English templates, 10 German templates, 6 evaluators, 200 verbalizations

Accuracy (1): Is everything that is expressed in the graph pattern also expressed in the sentence pattern?
Measured for each triple pattern within the GP:
(1) The triple pattern is explicitly expressed
(2) The triple pattern is implied
(3) The triple pattern is not expressed
(4) Unsure
[Figure: bar chart, categories (1) to (4), y-axis 0 to 160, series en and de]

Accuracy (2): Is everything that is expressed in the sentence pattern also expressed in the graph pattern?
(1) Everything is expressed
(2) Most things are expressed
(3) Some things are expressed
(4) Nothing is expressed
[Figure: bar chart, categories (1) to (4), y-axis 0 to 20, series en and de]


Evaluation (4/4)

Syntactical Correctness: How syntactically correct are verbalizations?
(1) Completely syntactically correct
(2) Almost syntactically correct
(3) Some syntactical errors
(4) Strongly syntactically incorrect
[Figure: bar chart, categories (1) to (4), y-axis 0 to 250, series en and de]

Understandability: How understandable are verbalizations?
(1) The meaning is clear
(2) The meaning is clear, but there are some problems in word usage and/or style
(3) The basic thrust is clear, but the evaluator is not sure of some detailed parts because of word usage problems
(4) Contains many word usage problems, and the evaluator can only guess at the meaning
(5) Cannot be understood at all
[Figure: bar chart, categories (1) to (5), y-axis 0 to 300, series en and de]


RELATED WORK


Related Work (1/4)

(Welty et al., 2010)

Focus on IE

Input sentences are parsed

Regard relations between proper nouns only

Does not consider a graph of relations


Related Work (2/4)

(Duma and Klein, 2013)

Focus on NLG


Related Work (3/4)

(Gerber and Ngomo, 2011)

Focus on IE

Relation expressed by string between entities

Example: pattern < ’s acquisition of > for property subsidiary

“Google’s acquisition of Youtube comes as online video is really starting to hit its stride.”


Related Work (4/4)

Distant supervision: (Craven and Kumlien, 1999), (Bunescu and Mooney, 2007), (Carlson et al., 2009), (Mintz et al., 2009), (Welty et al., 2010), (Hoffmann et al., 2011), (Surdeanu et al., 2012)

Simultaneous multi-relation learning: (Carlson et al., 2009)


SUMMARY


Summary

Introduced RDF verbalization templates

Introduced template extraction approach

Distant-supervised

Language independent

Simultaneous multi-relation learning

Frequent maximal subgraph pattern mining

Evaluation

Large parallel text-data corpus for en and de

Good syntactical correctness & understandability

Accuracy needs to be improved in future work


Thank you for your attention!

The authors acknowledge the support of the European Commission's Seventh Framework Programme FP7-ICT-2011-7 (XLike, Grant 288342).

http://km.aifb.kit.edu/sites/bridge-patterns/INLG2014/


References

Razvan Bunescu and Raymond Mooney. 2007. Learning to extract relations from the web using minimal supervision. In Annual Meeting of the Association for Computational Linguistics, volume 45, pages 576–583.

Andrew Carlson, Justin Betteridge, Estevam R Hruschka Jr, and Tom M Mitchell. 2009. Coupling semi-supervised learning of categories and relations. In Proceedings of the NAACL HLT 2009 Workshop on Semi-supervised Learning for Natural Language Processing, pages 1–9. Association for Computational Linguistics.

Mark Craven and Johan Kumlien. 1999. Constructing biological knowledge bases by extracting information from text sources. In Thomas Lengauer, Reinhard Schneider, Peer Bork, Douglas L. Brutlag, Janice I. Glasgow, Hans-Werner Mewes, and Ralf Zimmer, editors, ISMB, pages 77–86. AAAI.

Daniel Duma and Ewan Klein. 2013. Generating Natural Language from Linked Data: Unsupervised template extraction, pages 83–94. Association for Computational Linguistics, Potsdam, Germany.

Daniel Gerber and Axel-Cyrille Ngonga Ngomo. 2011. Bootstrapping the linked data web. In 1st Workshop on Web Scale Knowledge Extraction @ International Semantic Web Conference, volume 2011.

Raphael Hoffmann, Congle Zhang, Xiao Ling, Luke Zettlemoyer, and Daniel S Weld. 2011. Knowledge-based weak supervision for information extraction of overlapping relations. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1, pages 541–550. Association for Computational Linguistics.

Mike Mintz, Steven Bills, Rion Snow, and Dan Jurafsky. 2009. Distant supervision for relation extraction without labeled data. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - ACL-IJCNLP 09, pages 1003–1011.

Mihai Surdeanu, Julie Tibshirani, Ramesh Nallapati, and Christopher D Manning. 2012. Multi-instance multi-label learning for relation extraction. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 455–465. Association for Computational Linguistics.

Chris Welty, James Fan, David Gondek, and Andrew Schlaikjer. 2010. Large scale relation detection. In Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading, pages 24–33. Association for Computational Linguistics.
